pdoc3 / pdoc

:snake: :arrow_right: :scroll: Auto-generate API documentation for Python projects
https://pdoc3.github.io/pdoc/
GNU Affero General Public License v3.0

Error if an extension has been compiled for several Python versions. #407

Open nennigb opened 1 year ago

nennigb commented 1 year ago

Expected Behavior

I am documenting a package that contains a Fortran extension built with f2py/gfortran/gcc on Linux. Once the package is installed, the root folder contains mypkg.cpython-39-x86_64-linux-gnu.so, which is imported in the __init__.py file. If I run pdoc, it works fine ;-)

# Here I use py39
pdoc3 --force --config latex_math=True --html pypolsys
/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py:754: UserWarning: Couldn't read PEP-224 variable docstrings from <Module 'pypolsys.polsys'>: source code not available
  m = Module(import_module(fullname),
html/pypolsys/index.html
html/pypolsys/polsys.html
html/pypolsys/test.html
html/pypolsys/utils.html
html/pypolsys/version.html

I use several Python versions (the package is installed in editable mode), so mypkg.cpython-38-x86_64-linux-gnu.so or mypkg.cpython-310-x86_64-linux-gnu.so can also end up in the same place. Then pdoc crashes.

# Here I use py39
pdoc3 --force --config latex_math=True --html pypolsys
/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py:754: UserWarning: Couldn't read PEP-224 variable docstrings from <Module 'pypolsys.polsys'>: source code not available
  m = Module(import_module(fullname),
Traceback (most recent call last):
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py", line 222, in import_module
    module = importlib.import_module(module_path)
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 981, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'pypolsys.polsys.cpython-310-x86_64-linux-gnu'; 'pypolsys.polsys' is not a package

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/xyz/anaconda3/envs/work/bin/pdoc3", line 8, in <module>
    sys.exit(main())
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/cli.py", line 534, in main
    modules = [pdoc.Module(module, docfilter=docfilter,
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/cli.py", line 534, in <listcomp>
    modules = [pdoc.Module(module, docfilter=docfilter,
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py", line 754, in __init__
    m = Module(import_module(fullname),
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py", line 224, in import_module
    raise ImportError(f'Error importing {module!r}: {e.__class__.__name__}: {e}')
ImportError: Error importing 'pypolsys.polsys.cpython-310-x86_64-linux-gnu': ModuleNotFoundError: No module named 'pypolsys.polsys.cpython-310-x86_64-linux-gnu'; 'pypolsys.polsys' is not a package

Note that an arbitrary .so file such as abcd.so is also a problem. One would expect pdoc to ignore such additional .so files (filter by Python version, just pick one, ...).

Steps to Reproduce

  1. Pick a Python package with a Fortran extension
  2. Copy the .so file and rename the copy with another Python version tag, so that the package contains a second .so file
  3. Call pdoc3 mypkg

kernc commented 1 year ago

Since you seem to have already investigated it some, care to propose a fix here: https://github.com/pdoc3/pdoc/blob/2cce30a9b55eeeddc1ed826c8a2ada53777c3eea/pdoc/__init__.py#L196-L235 or here: https://github.com/pdoc3/pdoc/blob/2cce30a9b55eeeddc1ed826c8a2ada53777c3eea/pdoc/__init__.py#L719-L768 ?

kernc commented 1 year ago

It looks like something in iter_modules(). inspect.getmodulename(file) recognizes those .so files as valid modules, so the function offers them for importing ... :thinking:

nennigb commented 1 year ago

Yes, module_name = inspect.getmodulename(file) returns a module name. Inside this function, the list of all possible suffixes is given by importlib.machinery.all_suffixes():

['.py', '.pyc', '.cpython-39-x86_64-linux-gnu.so', '.abi3.so', '.so']

When processing the true extension, the suffix '.cpython-39-x86_64-linux-gnu.so' is detected and removed. With the wrong Python version, only the plain '.so' suffix matches and is removed, leading to a wrong module name...
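
The suffix matching described above can be illustrated with a small standalone sketch. It does not call pdoc or inspect.getmodulename itself; module_name_from_file is a hypothetical helper that imitates the longest-suffix-first stripping, and the suffix list is hard-coded from the Python 3.9 / Linux output quoted above so that the result does not depend on the interpreter running it.

```python
import os

# Suffix list copied from the comment above (Python 3.9 on x86_64 Linux).
# Hard-coded on purpose: importlib.machinery.all_suffixes() differs per interpreter.
SUFFIXES = ['.py', '.pyc', '.cpython-39-x86_64-linux-gnu.so', '.abi3.so', '.so']


def module_name_from_file(path, suffixes=SUFFIXES):
    """Hypothetical helper: strip the longest matching suffix from a file name."""
    fname = os.path.basename(path)
    # Try the longest suffixes first, so the full ABI tag wins over plain '.so'.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if fname.endswith(suffix):
            return fname[:-len(suffix)]
    return None


# Matching ABI tag: the whole tagged suffix is stripped.
print(module_name_from_file('polsys.cpython-39-x86_64-linux-gnu.so'))
# Foreign ABI tag: only '.so' matches, so the tag stays in the name.
print(module_name_from_file('polsys.cpython-310-x86_64-linux-gnu.so'))
```

The second name still contains dots, which is why the import machinery later treats 'polsys' as a package and fails.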

Perhaps we could add an extra check that module_name is valid. For instance, here it contains a '.'. I am not sure whether such a check would always hold at this stage, even though the loop is over module files... Another approach is to silence the path import error while keeping the import errors for invalid Python modules...

I also made a mistake: an arbitrary abcd.so also raises an error (the file was originally in the wrong place :-( I have edited my previous message).

kernc commented 1 year ago

I'd rather something more strict like:

if '.cpython-' in module_name:
    continue

PR welcome. I see that with the --skip-errors CLI switch, one can already turn this into a warning. https://github.com/pdoc3/pdoc/blob/2cce30a9b55eeeddc1ed826c8a2ada53777c3eea/pdoc/__init__.py#L753-L761

kernc commented 1 year ago

Can you show an example of an error with an abcd.so file?

nennigb commented 1 year ago

if '.cpython-' in module_name:
    continue

Agreed, it will strongly limit the side effects, but I am not sure it will work on other platforms. I will have a look.
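
To make the portability question concrete, here is a rough comparison of the strict '.cpython-' check with a looser "is this a valid identifier" check. This is only a sketch, not pdoc code: both helper functions are hypothetical, and the macOS and Windows names are assumptions based on the usual extension tag conventions (for example '.cpython-310-darwin.so' and '.cp310-win_amd64.pyd'), not names observed in this issue.

```python
def skipped_by_strict_check(module_name):
    """Hypothetical helper: the check quoted above."""
    return '.cpython-' in module_name


def skipped_by_identifier_check(module_name):
    """Hypothetical helper: skip anything that is not a plain Python identifier."""
    return not module_name.isidentifier()


# Candidate names as they might look after only the generic suffix was stripped.
candidates = [
    'polsys',                               # correctly stripped name
    'polsys.cpython-310-x86_64-linux-gnu',  # stale Linux build (from this issue)
    'polsys.cpython-310-darwin',            # stale macOS build (assumed tag format)
    'polsys.cp310-win_amd64',               # stale Windows build (assumed tag format)
    'libstdc++',                            # plain shared library (from this issue)
]

for name in candidates:
    print(f'{name!r}: strict={skipped_by_strict_check(name)}, '
          f'identifier={skipped_by_identifier_check(name)}')
```

Under these assumptions, the strict check would miss the Windows-style tag and non-extension libraries such as libstdc++, while the identifier check skips them all. Neither check helps with a file like abcd.so, whose name is a valid identifier but whose content is not an importable module.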

If abcd.so is an empty file, the error is

    raise ImportError(f'Error importing {module!r}: {e.__class__.__name__}: {e}')
ImportError: Error importing 'pypolsys.abcd': ImportError: /home/xyz/Recherche/CODE/polsysplp/pypolsys/abcd.so: file too short

If abcd.so is a copy of a valid extension module with a bad naming convention, the error is

    raise ImportError(f'Error importing {module!r}: {e.__class__.__name__}: {e}')
ImportError: Error importing 'pypolsys.abcd': ImportError: dynamic module does not define module export function (PyInit_abcd)

If the .so file is a valid shared library that is not a Python extension, the error is

    raise ImportError(f'Error importing {module!r}: {e.__class__.__name__}: {e}')
ImportError: Error importing 'pypolsys.libstdc++': ModuleNotFoundError: No module named 'pypolsys.libstdc++'

I will be happy to contribute.

PS: I missed the --skip-errors switch!