Open kevinvanleer opened 1 year ago
The key thing to remember is that the structure of the output mirrors the structure of the grammar. In your first grammar you have these rules:
query <- (expression @" "? expr_op? @" "?)+
expression <- negation? @" "? lvalue @" " operator @" " rvalue
You try to parse this string:
'not distance is 20 or not date is today and not date is tomorrow'
And you get this sequence of elements from the root node of the tree:
not distance is 20 or
not date is today and
not date is tomorrow
This mirrors the structure of the query
rule whose expression is (expression @" "? expr_op? @" "?)+
-- a repetition of one or more instances of the
expression expression @" "? expr_op? @" "?
. So each element of a match for
query
will contain both an expression
and a expr_op
.
In order to get this output:
not distance is 20
or
not date is today
and
not date is tomorrow
You would need a grammar where the query
rule was a repetion of either
expression
or expr_op
, i.e. (expression / expr_op)+
(with spaces added
where necessary). This is probably not what you want, because it means the
grammar would match nonsense queries like or or and or
.
If I change the grammar such that all nodes are required I get the expected output.
I'm not sure what you mean by "required" here, and the grammar presented after this paragraph is identical to the first one.
Then you give another version of the query
rule:
query <- expression (@" " expr_op @" " expression)*
This rule is a sequence of two elements: it requires one expression
followed
by zero or more instances of (" " expr_op " " expression)
. Here is the output
showing which part of the rule is responsible for each part:
not distance is 20 expression
or not date is today and not date is tomorrow (@" " expr_op @" " expression)*
If you look inside the second element here you'll see multiple elements for each
match of the (...)*
expression:
or not date is today
and not date is tomorrow
This rule looks to me like the correct grammar, as you seem to want your query
language to allow one or more expressions
joined together by and
/or
operators. Every other version of query
you've shown will also match other
strings that won't make any sense, e.g. allowing the string to start or end with
an operator, or be composed entirely of operators.
First of all, thanks for developing this great library! I am prototyping a search grammar in javascript with the hopes of porting it to Java for uses in an android app.
I am building a search grammar that takes the form
expression (operator expression)*
. If I build the grammar using optional repetition, the optional nodes are missing from the output.Example:
peg file
test app.js
If I attempt to parse the following string:
I get the following output using the URL parser sample code:
I expected to get:
If I change the grammar such that all nodes are required I get the expected output.
If I print the entire tree I see that all required nodes are in the tree, but optional nodes are not. Here is an excerpt from an expression node:
You can see that all nodes other than the optional
negation
node are listed. The negation node is parsed as a member of the elements list. This is also true for theexpr_op
nodes. They are only included in the tree if they are required nodes.If I change the grammar to use a different form of repetition I get a third output:
output:
However the
expr_op
nodes are included as nodes in the second blob.It seems like
and
should produce the same output and that the
expr_op
nodes andexpression
nodes should be at the same level in the tree. Your help is greatly appreciated!EDIT: I was hoping the tree object would look something like this: