Closed liquidev closed 2 years ago
I do like the proposal and how clean the use is, just some random thoughts as they bubble up:
The proposal is basically the same as the "external stack method", but manages the stack for the user. I think the cognitive load on the user is about the same, although it saves some administrative manual management of the stack
I fear it will complicate NPegs implementation for this specific use case. NPeg is already "double generic" under the hood, as it already keeps track of the subject type and userdata type; adding this feature would also need an AST type (or reuse/abuse the userdata type and lose flexibility)
I'll justify the second argument by saying that parsing structured grammars into AST is a very common use case when building scripting languages (which is a thing at least I do), and also parsing data formats (such as JSON). As for the first argument, I'd say that this:
call <- A(ident) * '(' * A(expr * *(',' * expr)) * ')':
build Node(kind: nkCall, fn: node(1), arg: nodes(2))
is much more cleaner than the equivalent manual stack method, which is what I use in cflang, putting aside the fact that cflang's grammar for calls is a bit different than your usual fn(a, b, c)
:
callArgs <- expr * *([tkComma] * expr)
call <- (callOp * ?callArgs) ^ 1:
var args: seq[Node]
while not (p.top.kind == nkCall and not p.top.done):
args.add p.pop
reverse args
p.top.sons.add(args)
p.top.done = true
Compared to my proposed solution, you're inducing extra cognitive load on the user to try and understand where the stack elements come from, and what is done to them. In my version, it's immediately obvious that the AST captures are used.
I hear you, I do. This is how I do the above, though:
https://github.com/zevv/mathpipe/blob/master/mp.nim#L60
You'll have to agree that does not look too bad.
I personally do not like the placeholder idea, it feels quite flaky and easy to mess up like I did.
While I personally think the current functionality is fine as-is, I could see someone building an "extension" module that builds on top of NPeg to provide AST generation capabilities.
Closing as part of the spring cleaning, feel free to reopen if needed.
Currently, AST building with NPeg is quite difficult and bug-prone. You have to maintain a stack of nodes and use a bunch of relatively inefficient workarounds to build a nice, inspectable, traversable abstract syntax tree.
My proposal is to allow the user to specify their own Node type, and add primitives for building abstract syntax trees. Here's my idea for the API:
Usage of
node(n)
on a sequence capture would result in either an error or the first node being returned. Usage ofnodes(n)
on a singular capture would result in an error or a sequence with only that node being returned.