During a run of pdoc3, I ran into a Python file with a comment containing non-UTF-8 characters. While this shouldn't have happened in the first place, pdoc3's current behavior is to raise an exception without giving any clue about which file is the culprit:
Traceback (most recent call last):
File "...\pdoc3\pdoc\__main__.py", line 6, in <module>
main()
File "...\pdoc3\pdoc\cli.py", line 534, in main
modules = [pdoc.Module(module, docfilter=docfilter,
File "...\pdoc3\pdoc\cli.py", line 534, in <listcomp>
modules = [pdoc.Module(module, docfilter=docfilter,
File "...\pdoc3\pdoc\__init__.py", line 754, in __init__
m = Module(import_module(fullname),
File "...\pdoc3\pdoc\__init__.py", line 675, in __init__
var_docstrings, _ = _pep224_docstrings(self)
File "...\pdoc3\pdoc\__init__.py", line 269, in _pep224_docstrings
_ = inspect.findsource(doc_obj.obj)
File "C:\Python39\lib\inspect.py", line 831, in findsource
lines = linecache.getlines(file, module.__dict__)
File "C:\Python39\lib\linecache.py", line 46, in getlines
return updatecache(filename, module_globals)
File "C:\Python39\lib\linecache.py", line 137, in updatecache
lines = fp.readlines()
File "C:\Python39\lib\codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 2403: invalid continuation byte
Adding UnicodeDecodeError to the list of exceptions caught in _pep224_docstrings() changes the output to the following warnings, which include the name of the module causing the problem:
...\pdoc3\pdoc\__init__.py:754: UserWarning: Couldn't read PEP-224 variable docstrings from <Module 'unifiedmodel.enum'>: 'utf-8' codec can't decode byte 0xe4 in position 2403: invalid continuation byte
m = Module(import_module(fullname),
...\pdoc3\pdoc\__init__.py:754: UserWarning: Couldn't read PEP-224 variable docstrings from <Class 'unifiedmodel.enum.Entry'>: 'utf-8' codec can't decode byte 0xe4 in position 2403: invalid continuation byte
m = Module(import_module(fullname),
...\pdoc3\pdoc\__init__.py:754: UserWarning: Couldn't read PEP-224 variable docstrings from <Class 'unifiedmodel.enum.Enum'>: 'utf-8' codec can't decode byte 0xe4 in position 2403: invalid continuation byte
m = Module(import_module(fullname),
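For illustration, here is a minimal sketch of the pattern. It is not pdoc3's actual code: `read_source_or_warn`, `label` and the injectable `find_source` parameter are hypothetical names used only for this example. The idea is simply to catch UnicodeDecodeError next to the OSError and TypeError that the inspect source functions are documented to raise, and to turn it into a warning that names the offending object.

```python
import inspect
import warnings


def read_source_or_warn(obj, label, find_source=inspect.findsource):
    """Return the source lines of obj, or None after emitting a warning.

    Hypothetical helper for illustration only; `label` stands in for
    whatever identifies the documented object (e.g. its repr).
    """
    try:
        lines, _ = find_source(obj)
    except (OSError, TypeError, UnicodeDecodeError) as exc:
        # UnicodeDecodeError is the addition: a source file that cannot be
        # decoded now yields a warning naming the object instead of an
        # unhandled traceback that hides which file was at fault.
        warnings.warn(
            f"Couldn't read PEP-224 variable docstrings from {label}: {exc}"
        )
        return None
    return lines
```

With this shape, one undecodable file no longer aborts the whole run, and the warning text points at the module or class whose source could not be read.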