Closed magniff closed 8 years ago
@vlasovskikh validate please)
Why do you need to define a forward declaration parser using another forward declaration, not a real parser?
@vlasovskikh
It actually happens when your grammar has loops, like this one. Have a look at the location and expr nonterminals: they depend on each other. That is not a big deal if you implement the parser as a single class or module, but it is one if the parser's components are decomposed and spread all over the project, i.e. across multiple packages. If so, sooner or later you will get a module dependency loop. Surely the language designer (an MIT professor in this case) could rework the grammar to avoid this, but sometimes (and it actually happens) the grammar rules are not known at compile time, say if you are using some import hook magic or metaprogramming. It turns out that splitting a parser's declaration from its implementation helps. Let's say we have a separate package for each nonterminal: just add a declaration module to it that contains the parser declaration, like this: p_location = funcparserlib.parser.forward_decl(). With this approach we can reuse the p_location object without import loops. Have a look at a sample project structure:
parsers
--- __init__.py
--- location
------ parser.py # defines p_location and imports expression.declaration.p_expression
------ declaration.py # declares p_location
--- expression
------ parser.py # defines p_expression and imports location.declaration.p_location
------ declaration.py # declares p_expression
Now, to define p_location in location.parser we can do from .declaration import p_location. To use the declaration of p_location in expression.parser, just do from ..location.declaration import p_location. And the last thing: to make sure that the parsers package is built correctly for an external user, we should add to parsers.__init__ something like
from .location.parser import p_location
from .expression.parser import p_expression
The code above guarantees that the 'definition code' for each parser is actually invoked.
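One rough way to check that this layout imports without a dependency loop is to generate it in a temporary directory and import it. The sketch below rests on a few assumptions that are not part of the description above: a hypothetical Decl class in parsers/stub.py stands in for funcparserlib.parser.forward_decl() so the example needs no third-party install, the subpackages get empty __init__.py files, and define() only records what a parser refers to instead of building a real parser.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Source files for the sample layout. `stub.Decl` is a hypothetical
# stand-in for a forward-declared parser, not a funcparserlib class.
FILES = {
    "parsers/__init__.py": """
        from .location.parser import p_location
        from .expression.parser import p_expression
    """,
    "parsers/stub.py": """
        class Decl:
            def __init__(self, name):
                self.name = name
                self.uses = None      # filled in later by define()

            def define(self, *uses):
                self.uses = uses
    """,
    "parsers/location/__init__.py": "",
    "parsers/location/declaration.py": """
        from ..stub import Decl
        p_location = Decl("location")
    """,
    "parsers/location/parser.py": """
        from .declaration import p_location
        from ..expression.declaration import p_expression
        p_location.define(p_expression)
    """,
    "parsers/expression/__init__.py": "",
    "parsers/expression/declaration.py": """
        from ..stub import Decl
        p_expression = Decl("expression")
    """,
    "parsers/expression/parser.py": """
        from .declaration import p_expression
        from ..location.declaration import p_location
        p_expression.define(p_location)
    """,
}


def build_and_import(root):
    """Write the package under `root` and import its top-level module."""
    for rel_path, source in FILES.items():
        path = Path(root) / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(textwrap.dedent(source))
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    return importlib.import_module("parsers")


parsers = build_and_import(tempfile.mkdtemp())
# Each parser module imports only the *declaration* of the other one,
# so neither parser.py ever has to import the other parser.py.
```

The two parser.py modules never import each other; each one depends only on declaration modules, which in turn depend on nothing inside the package except the stub.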
Having recursive definitions spread among several modules is usually a bad practice. I would rather keep the forward declaration parser in the library simple.
Consider monkey-patching the forward declaration parser or switching to a single module structure for your grammar.
The strongest point of recursive-descent parsing is that it allows composability unattainable with classic parser generators, and being able to split anything into multiple modules once your single module becomes large enough is definitely a huge plus. Not having non-eager forward decls prevents that in some cases, such as the one described by @magniff above. Are there any cons to this approach, other than possible weird AttributeErrors popping up in strange places?
The problem is as straightforward as it gets: if you call parser0.define(parser1) and parser1 hasn't been defined yet, you just copy its dummy run method implementation. I think this is not how it is meant to work. The following little fix adds more laziness to the evaluation cascade.
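To show the difference in isolation, here is a toy model of the two behaviours. It is a simplified sketch, not funcparserlib's code and not the patch itself: ForwardDecl, Literal, define_eager, define_lazy and outcome are hypothetical names introduced only for this illustration.

```python
class ForwardDecl:
    """Toy forward declaration; a hypothetical stand-in, not funcparserlib's Parser."""

    def run(self, tokens):
        # Dummy implementation used until one of the define_* methods is called.
        raise NotImplementedError("forward declaration is not defined yet")

    def define_eager(self, other):
        # Snapshot: binds whatever `other.run` is at this very moment.
        self.run = other.run

    def define_lazy(self, other):
        # Deferred: `other.run` is looked up every time this parser runs.
        self.run = lambda tokens: other.run(tokens)


class Literal:
    """Toy parser that accepts exactly one expected token."""

    def __init__(self, expected):
        self.expected = expected

    def run(self, tokens):
        if tokens[:1] == [self.expected]:
            return self.expected
        raise ValueError(f"expected {self.expected!r}")


def outcome(parser, tokens):
    """Return the parse result, or the name of the exception that was raised."""
    try:
        return parser.run(tokens)
    except (NotImplementedError, ValueError) as exc:
        return type(exc).__name__


# Eager: parser0 is defined via parser1 before parser1 itself is defined,
# so parser0 keeps parser1's dummy run method.
parser0, parser1 = ForwardDecl(), ForwardDecl()
parser0.define_eager(parser1)
parser1.define_eager(Literal("x"))
eager = outcome(parser0, ["x"])

# Lazy: same order of calls, but parser1.run is resolved only at run time.
parser0, parser1 = ForwardDecl(), ForwardDecl()
parser0.define_lazy(parser1)
parser1.define_lazy(Literal("x"))
lazy = outcome(parser0, ["x"])
```

With the eager variant the order of define calls matters; with the lazy variant it does not, at the cost of one extra attribute lookup and call per run.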