pdoc3 / pdoc

:snake: :arrow_right: :scroll: Auto-generate API documentation for Python projects
https://pdoc3.github.io/pdoc/
GNU Affero General Public License v3.0
1.11k stars 143 forks source link

BUG: Strengthen signature detection for pybind generated modules #435

Closed wabscale closed 4 months ago

wabscale commented 6 months ago

I recently ran into a rather complex issue with pdoc and pybind modules. The signature detection peice of code can be very easily broken under specific conditions.

Problem

In the signature detection piece of code, the signature of functions without easily detectable functions are evaluated. For pybind, this is always run to my knowledge. If you then have a default argument in this function, it will be inserted into the exec along with the rest of the signature.

https://github.com/wabscale/pdoc/blob/3ecfbcfb658c5be9ee6ab572b63db2cb5e1c29e1/pdoc/init.py#L1569

try:
    exec(f'def {string}: pass', _globals, _locals)
except SyntaxError:
    # ...

What I have found is that if you have a class that has __repr__ defined to return a string version of the class, that is then the default argument to a function, that __repr__ value will be inserted into the exec. What then can become a problem is if it causes an error in the exec, and that error is not SyntaxError.

A reasonable example of this is if we have our own DateTime class in the pybind c++ that has its __repr__ return the string formatted value for that DateTime. If you then have a function that uses a DateTime default argument, then the formatted string will be inserted into that eval. Lets say the string version of the DateTime is "20231222-165422-EST". What gets inserted into the exec function could look like:

exec('def foo(self: A, dt: DateTime = 20231222-165422-EST) -> None: pass')

This exec will not raise a SyntaxError but a NameError:

Cell In[2], line 1
----> 1 exec('def foo(self: A, dt: DateTime = 20231222-165422-EST) -> None: pass')

File <string>:1

NameError: name 'EST' is not defined

Solution

The solution to make this work as expected is to loosen the exception handling for the exec call. I would suggest:

try:
    exec(f'def {string}: pass', _globals, _locals)
except Exception:
    continue

Reproducing my claims

As this requires building a pybind module, there is a bit more than just a snippet of python code to reproduce the issue. I have made a seperate repo with all the files ready to go: https://github.com/wabscale/pdoc3-bug-poc

kernc commented 4 months ago

The solution to make this work as expected is to loosen the exception handling for the exec call.

What do you mean when you say expected? With NameError caught, the routine still fails through to the next pattern / no match for those pybind11 module exports.

Which, I'll agree, is way better than raising.

Thanks!