Instagram / LibCST

A concrete syntax tree parser and serializer library for Python that preserves many aspects of Python's abstract syntax tree
https://libcst.readthedocs.io/

Recursion depth exceeded when applying MetadataWrapper on a very long concatenated string #1178

Open sirosen opened 4 months ago

sirosen commented 4 months ago

Running a simple roundtripper on every Python file in the CPython repo's Lib dir found this interesting case. Aside: this is the only new case I found in my follow-up investigation from #1095.

Here's a permalink to the current version of pydoc_data/topics.py

Attempting to roundtrip this file in my scratch env produces a huge trace of the form:

long trace (with sections snipped)

```
Traceback (most recent call last):
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 358, in deep_clone
    cloned_fields[key] = tuple(_clone(v) for v in val)
                         ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'Dict' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 358, in deep_clone
    cloned_fields[key] = tuple(_clone(v) for v in val)
                         ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'ConcatenatedString' object is not iterable

During handling of the above exception, another exception occurred:

... repeats many times ...

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 358, in deep_clone
    cloned_fields[key] = tuple(_clone(v) for v in val)
                         ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'ConcatenatedString' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 358, in deep_clone
    cloned_fields[key] = tuple(_clone(v) for v in val)
                         ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'SimpleString' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sirosen/_scratch/rt.py", line 21, in <module>
    _roundtrip_data(content)
  File "/home/sirosen/_scratch/rt.py", line 12, in _roundtrip_data
    wrapped_tree = libcst.MetadataWrapper(raw_tree)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/metadata/wrapper.py", line 146, in __init__
    module = module.deep_clone()
             ^^^^^^^^^^^^^^^^^^^
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 358, in deep_clone
    cloned_fields[key] = tuple(_clone(v) for v in val)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 358, in <genexpr>
    cloned_fields[key] = tuple(_clone(v) for v in val)
                               ^^^^^^^^^
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 105, in _clone
    return val.deep_clone()
           ^^^^^^^^^^^^^^^^
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 358, in deep_clone
    cloned_fields[key] = tuple(_clone(v) for v in val)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ... repeats many times ...
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 362, in deep_clone
    return type(self)(**cloned_fields)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 6, in __init__
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/base.py", line 117, in __post_init__
    self._validate()
  File "/home/sirosen/_scratch/.venv/lib/python3.11/site-packages/libcst/_nodes/expression.py", line 592, in _validate
    super(SimpleString, self)._validate()
RecursionError: maximum recursion depth exceeded
```

In case it's relevant, my no-op transformer is this body of code:

rt.py

```python
import sys

import libcst


class NoopTransformer(libcst.CSTTransformer):
    pass


def _roundtrip_data(content: bytes) -> bytes:
    raw_tree = libcst.parse_module(content)
    wrapped_tree = libcst.MetadataWrapper(raw_tree)
    tree = wrapped_tree.visit(NoopTransformer())
    return tree.code.encode(tree.encoding)


fn = sys.argv[1]
with open(fn, "rb") as fp:
    content = fp.read()

_roundtrip_data(content)
```

My guess would be that this is hard to solve because it probably requires unwinding a complex recursive construction into something non-recursive. Given that cpython can support this, I thought it was interesting enough to be worth reporting.
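To illustrate what "unwinding" could look like, here is a minimal sketch on a toy tree, not libcst's actual node classes. `Leaf`, `Concat`, `build_chain`, `clone_recursive`, `clone_iterative`, and `leaf_values` are all hypothetical names, and the right-nested shape is an assumption made for the example. The recursive clone needs one Python frame per nesting level, while the iterative one walks the chain with a loop and rebuilds it from the innermost node outward.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Leaf:
    """Toy stand-in for a single string literal."""
    value: str


@dataclass(frozen=True)
class Concat:
    """Toy stand-in for a concatenation whose right side may nest further."""
    left: Leaf
    right: Union[Leaf, "Concat"]


def build_chain(n: int) -> Union[Leaf, Concat]:
    """Build a right-nested chain of n leaves without recursion."""
    node: Union[Leaf, Concat] = Leaf("s0")
    for i in range(1, n):
        node = Concat(Leaf(f"s{i}"), node)
    return node


def clone_recursive(node: Union[Leaf, Concat]) -> Union[Leaf, Concat]:
    """Straightforward clone: one call frame per nesting level."""
    if isinstance(node, Leaf):
        return Leaf(node.value)
    return Concat(Leaf(node.left.value), clone_recursive(node.right))


def clone_iterative(root: Union[Leaf, Concat]) -> Union[Leaf, Concat]:
    """Same result, but the pending work lives in a list, not the call stack."""
    lefts: List[Leaf] = []
    node = root
    while isinstance(node, Concat):  # walk down the right spine
        lefts.append(node.left)
        node = node.right
    cloned: Union[Leaf, Concat] = Leaf(node.value)  # innermost leaf
    for left in reversed(lefts):  # rebuild from the inside out
        cloned = Concat(Leaf(left.value), cloned)
    return cloned


def leaf_values(root: Union[Leaf, Concat]) -> List[str]:
    """Collect leaf values iteratively, outermost first."""
    values: List[str] = []
    node = root
    while isinstance(node, Concat):
        values.append(node.left.value)
        node = node.right
    values.append(node.value)
    return values
```

On a chain far deeper than the default recursion limit, `clone_iterative(build_chain(50_000))` completes, whereas `clone_recursive` would be expected to raise `RecursionError` on the same input. A real tree with several child fields per node would need a more general explicit stack than this single-spine walk.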

kiri11 commented 3 months ago

Maybe just increase the recursion limit? Is it currently the default 1000? You can check with sys.getrecursionlimit() and increase with sys.setrecursionlimit
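A minimal sketch of that workaround, using only the standard library. `recursion_limit` and `nested_depth` are hypothetical helper names, and the value needed for a given file is a guess that would have to be found by trial. The context manager raises the limit for one block and restores the previous value afterwards.

```python
import sys
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def recursion_limit(limit: int) -> Iterator[None]:
    """Temporarily raise the interpreter's recursion limit to at least `limit`."""
    previous = sys.getrecursionlimit()
    # Never lower the limit below what is already configured.
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        # Restore the original limit even if the block raised.
        sys.setrecursionlimit(previous)


def nested_depth(n: int) -> int:
    """Deliberately recursive helper used only to exercise the limit."""
    return 0 if n == 0 else 1 + nested_depth(n - 1)


if __name__ == "__main__":
    print("default limit:", sys.getrecursionlimit())
    with recursion_limit(5000):
        print("depth reached:", nested_depth(3000))
    print("limit restored:", sys.getrecursionlimit())
```

In the script from the report, the parse, wrap, and visit calls could be placed inside `with recursion_limit(...)`. Setting the limit very high trades a `RecursionError` for the risk of overflowing the C stack and crashing the interpreter, so this is a workaround rather than a fix.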

sirosen commented 3 months ago

That probably works if my goal is to make this work by hook or by crook. I opened this primarily because it was an interesting discovery I felt was worth sharing with the libcst maintainers. It's not really blocking me from any work. (I would not be bothered if libcst declares this a non-issue and closes.)