lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License
4.77k stars 406 forks source link

Documentation: should not advocate a coding style that creates reference cycles #1217

Closed fischbacher closed 1 year ago

fischbacher commented 1 year ago

Describe the bug

Right here in the documentation:

https://lark-parser.readthedocs.io/en/latest/recipes.html#keeping-track-of-parents-when-visiting

following this example readily creates situations where for some nodes N, we have [c.parent is N for c in N.children if isinstance(c, Tree)]. That's a reference cycle.

Existence of the gc module nonwithstanding, it is generally not advisable to create such situations in a language where relevant implementations (here, CPython) use reference counting as a substitute for a GC.

It might be more advisable here to instead recommend using a class that owns the parsed tree, plus extra data alongside it, such as a dictionary node_by_id = {id(node): node for node in {... all the nodes ...}} alongside a dictionary {id(node): id(parent) for parent node_by_id.values() for node in parent.children}.

erezsh commented 1 year ago

Updated docs.

GenhaoLi commented 1 year ago

Updated docs.

Hi @erezsh , I found the updated doc a bit confusing.

subtree.parent = proxy(tree)

What is the function proxy here? I think the doc should mention the actual reason, as stated in this issue, and implement the function proxy or import it from some library.

Thanks!

erezsh commented 1 year ago

Hi @Gytobp , proxy() is a builtin python function, available using from weakref import proxy.