neogeny / TatSu

竜 TatSu generates Python parsers from grammars in a variation of EBNF
https://tatsu.readthedocs.io/
Other
408 stars 48 forks source link

Unusual use of `__new__()` in `NodeWalker` #266

Closed dnicolodi closed 2 years ago

dnicolodi commented 2 years ago

Hunting for old-style super() calls I bumped into the __new__() method of the NodeWalker class:

class NodeWalker:
    def __new__(cls, *args, **kwargs):
        cls._walker_cache = {}
        return super(NodeWalker, cls).__new__(cls)

As the class does not inherit from other classes the use of super() seems redundant here as it is equivalent to simply calling cls() but this is not what is unusual. What is unusual is modifying the class definition in the __new__() method. This means that all NodeWalker instances share the same _walker_cache dictionary and that this dictionary in reset every time a new NodeWalker instance is created. I cannot find a reason why this could be desirable.

Shouldn't this __new__() method be turned into a __init__() method?

apalala commented 2 years ago

I don't remember when or why this code was implemented. It seems that the intent can only be handled by a metaclass.

dnicolodi commented 2 years ago

As far as I know, NodeWalker classes are not created in large number. I think the most common case is to have one NodeWalker sublclass instance per parser instance. If more than one NodeWalker subclasse instances are created per parser, they are almost certainly different subclasses. Thus I don't see how sharing a cache between different instances of NodeWalker is useful.

Should we make this __new__() into an __init__()?

Do you have some project that you can use to benchmark if this causes a performance regression?

dnicolodi commented 2 years ago

Thinking a bit more about this, it may be that this implementation was an attempt to add instance initialization without requiting subclasses to have to call super().__init__() in their initialization. However, this results in somehow bizarre behavior, thus I think changing the implementation is worth the minuscule compatibility breakage. The breakage manifests with an obvious error and the fix is straightforward.