Closed: disconnect3d closed this 1 day ago
The open() call needs to set the encoding: use a better detection strategy, support utf-8, or let the user set it.
Here are the workarounds for when open() guesses the encoding from the environment: https://stackoverflow.com/questions/36303919/what-encoding-does-open-use-by-default
I'd recommend setting the encoding explicitly as most linters recommend.
    def visit_abstract_syntax_trees(self) -> None:
        for file_path in self.filenames:
            with open(file_path, encoding="utf-8") as f:
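To illustrate the difference between the environment-derived default and an explicit encoding, here is a minimal sketch. The helper names `default_text_encoding` and `read_source` are hypothetical and are not part of deadcode; the sketch assumes that `locale.getpreferredencoding(False)` reflects what open() falls back to when no encoding is passed.

```python
import locale
import tempfile
from pathlib import Path


def default_text_encoding() -> str:
    """Return the encoding open() is assumed to fall back to (hypothetical helper)."""
    # This value depends on the locale and on UTF-8 mode, so it varies by machine.
    return locale.getpreferredencoding(False)


def read_source(path: Path, encoding: str = "utf-8") -> str:
    """Read a text file with an explicit encoding instead of the default."""
    with open(path, encoding=encoding) as f:
        return f.read()


# Write UTF-8 bytes to a temporary file, then read them back explicitly.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.py"
    sample.write_bytes('name = "h\u00e9llo"\n'.encode("utf-8"))
    text = read_source(sample)
```

Passing the encoding explicitly makes the result independent of the machine's locale, but it still fails on files that are not valid UTF-8.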
> The open() needs to set the encoding use a better detection strategy or support utf-8 or let the user set it.
The error explicitly says "utf-8 codec can't decode byte ..." which means it attempted to read the file in utf-8 and failed. I doubt you can automagically detect the encoding for an arbitrary file.
The best course of action may be just reading the file in binary form and operating on that?
Fwiw:
File content:
b'# coding: iso-8859-5\n# (Unlikely to be the default encoding for most testers.)\n# \xb1\xb6\xff\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef <- Cyrillic characters\nu = "\xae\xe2\xf0\xc4"\n'
EDIT: Huh, in this case the file specifies its encoding... :)
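When a Python file declares its encoding like this, the standard library can read the declaration. A minimal sketch using `tokenize.detect_encoding`; the helper name `detect_source_encoding` and the shortened `SOURCE` bytes are illustrative assumptions, not deadcode code:

```python
import io
import tokenize

# Shortened version of the file content above (assumption: same coding cookie).
SOURCE = b'# coding: iso-8859-5\nu = "\xae\xe2\xf0\xc4"\n'


def detect_source_encoding(source: bytes) -> str:
    """Return the encoding declared by a BOM or a PEP 263 cookie, else utf-8."""
    # detect_encoding reads at most two lines through the given readline callable.
    encoding, _lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


encoding = detect_source_encoding(SOURCE)
text = SOURCE.decode(encoding)
```

This only works for Python source files that follow PEP 263; it does not detect the encoding of arbitrary files.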
@disconnect3d Thank you for your suggestion. I completely agree that files should be analyzed in binary. I have implemented this change via 4ab07e9 and released it in version 2.3.1.
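For reference, a minimal sketch of what parsing in binary can look like. This is not necessarily what 4ab07e9 does; the helper `parse_bytes` is hypothetical. It relies on `ast.parse` accepting bytes and honoring the file's own coding declaration:

```python
import ast

# Shortened version of the non-UTF-8 test file (assumption: same coding cookie).
SOURCE = b'# coding: iso-8859-5\nu = "\xae\xe2\xf0\xc4"\n'


def parse_bytes(source: bytes, filename: str = "<unknown>") -> ast.Module:
    """Parse raw source bytes; the compiler decodes them using the coding cookie."""
    return ast.parse(source, filename=filename)


tree = parse_bytes(SOURCE)
# The first statement is `u = "..."`; pull out the decoded string constant.
value = tree.body[0].value.value
```

In practice the bytes would come from `open(file_path, "rb")`, so no text decoding happens before the parser sees the file.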
@albertas Awesome, thanks!
Hi, the deadcode tool crashes when it encounters a non-UTF-8 file. TL;DR:
This occurred when it tried to parse the following file:
.venv/lib64/python3.11/site-packages/IPython/core/tests/nonascii.py
which can be found here: https://github.com/ipython/ipython/blob/main/IPython/core/tests/nonascii.py