arineng / jcr

JSON Content Rules Draft

Combiner precedence #3

Closed codalogic closed 8 years ago

codalogic commented 9 years ago

Moved and enhanced from jcrvalidator...

What is the relative precedence of the choice and sequence combiners? The current text says "Sequence and choice combinations maybe mixed, with evaluation occurring in the order the rules are specified", but I'm not sure what that means. For example, does [ this | that, other | more ] correspond to [ (this | that), (other | more) ], [ this | (that, other) | more ], or [ ((this | that), other) | more ] (left-to-right precedence?)?

Relax NG says that you have to be explicit about the groupings, so [ this | that, other | more ] would be illegal. The GCC C/C++ compiler generates a warning if you do the equivalent of that, even though it is not ambiguous in the language, which suggests people struggle with the precedence of these operators. It makes me think that the Relax NG approach has some merit and one should be required to explicitly group operators, i.e. do [ (this | that), (other | more) ], or [ this | (that, other) | more ].

What do you think?

anewton1998 commented 9 years ago

Well, it's the latter: ( ( ( this | that ), other ) | more )

I've never really found that knowing the precedence is left to right makes things hard to read. In fact, I think throwing in all those groupings makes it harder to read, because now I have to do parenthesis matching to be sure I understand it.

codalogic commented 9 years ago

I guess my difficulty is that I look at the comma and think that looks like the logical AND in conditional clauses like in C (and descendants) and look at the pipe and think that's like logical OR in conditional clauses, and start interpreting the statement like that. I imagine (hope!) I'm not alone in that, so perhaps we should put an example in the spec to show how it works in the case of JCR.

anewton1998 commented 9 years ago

In thinking of rules being evaluated, thinking of the comma as an AND and the pipe as an OR seems appropriate to me. You are right that we need to put a better example in the draft.

anewton1998 commented 9 years ago

Just for clarity, I'm open to changing things. It's just my opinion that forcing the use of groups leads to improved readability.

codalogic commented 9 years ago

Did you mean "forcing the use of groups leads to {improved}->{reduced} readability."?

anewton1998 commented 9 years ago

Sorry. Forcing the use of groups does not improve readability.

codalogic commented 9 years ago

Which other languages use the left-to-right form of combining ANDs and ORs? If there are a number of conventions maybe forcing grouping isn't a bad thing? I agree Readability is important, but Avoiding Misunderstanding probably should trump it!

anewton1998 commented 9 years ago

Don't most languages like C, Java, Ruby, etc... have operator precedence from left to right? And isn't the concept of short circuiting predicated on that?

So when I see [ one , two | three ] I think I interpret this as:

  1. if rule one evaluates to true, continue
  2. if rule two evaluates to true, no need to go to rule three because the array rule evaluates to true.
  3. if rule two evaluates to false, look at rule three
  4. if rule three evaluates to true, then the array rule is true
  5. if rule three evaluates to false, the array rule is false

codalogic commented 9 years ago

Alas not. && has higher precedence than ||. It's like multiply and addition in maths (or math!).

I wasn't sure so I wrote the following C++ code to generate a Markdown table:

    #include <iostream>

    // Prints a Markdown truth table comparing the ungrouped expression with
    // its two possible groupings (&#124; is an escaped '|' for the table).
    int main()
    {
        std::cout << "\na|b|c|d|a &#124;&#124; b && c &#124;&#124; d | "
                        "a &#124;&#124; (b&&c) &#124;&#124; d | "
                        "((a &#124;&#124; b) && c) &#124;&#124; d\n";
        std::cout << "---|---|---|---|---|---|---\n";
        for( int a =0; a<2; ++a )
            for( int b =0; b<2; ++b )
                for( int c =0; c<2; ++c )
                    for( int d =0; d<2; ++d )
                    {
                        std::cout << a << " | " << b << " | " << c << " | " << d << " | " <<
                                (a||b && c||d) << " | " <<
                                (a || (b&&c) || d) << " | " <<
                                (((a||b) && c) || d) << "\n";
                    }
    }

That gives the following truth table:

a  b  c  d   a || b && c || d   a || (b&&c) || d   ((a || b) && c) || d
0  0  0  0   0                  0                  0
0  0  0  1   1                  1                  1
0  0  1  0   0                  0                  0
0  0  1  1   1                  1                  1
0  1  0  0   0                  0                  0
0  1  0  1   1                  1                  1
0  1  1  0   1                  1                  1
0  1  1  1   1                  1                  1
1  0  0  0   1                  1                  0
1  0  0  1   1                  1                  1
1  0  1  0   1                  1                  1
1  0  1  1   1                  1                  1
1  1  0  0   1                  1                  0
1  1  0  1   1                  1                  1
1  1  1  0   1                  1                  1
1  1  1  1   1                  1                  1

So you can see (a||b && c||d) equates to (a || (b&&c) || d) rather than (((a||b) && c) || d). (Although they are stunningly similar, which I guess could lead to some nasty bugs.)

Good question about short-circuiting. I wasn't sure how it does that, so I did the following:

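    #include <iostream>

    // Prints the label s (so the evaluation order is visible) and returns v.
    int p( int v, const char * s ) { std::cout << s << " "; return v; }

    int main()
    {
        std::cout << "\na|b|c|d|order\n";
        std::cout << "---|---|---|---|---\n";
        for( int a =0; a<2; ++a )
            for( int b =0; b<2; ++b )
                for( int c =0; c<2; ++c )
                    for( int d =0; d<2; ++d )
                    {
                        std::cout << a << " | " << b << " | " << c << " | " << d << " | ";
                        (p(a,"a") || p(b,"b") && p(c,"c") || p(d,"d"));
                        std::cout << "\n";
                    }
    }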

where p( v, s ) just prints s and returns v. It showed the evaluation order as:

a b c d order
0 0 0 0 a b d
0 0 0 1 a b d
0 0 1 0 a b d
0 0 1 1 a b d
0 1 0 0 a b c d
0 1 0 1 a b c d
0 1 1 0 a b c
0 1 1 1 a b c
1 0 0 0 a
1 0 0 1 a
1 0 1 0 a
1 0 1 1 a
1 1 0 0 a
1 1 0 1 a
1 1 1 0 a
1 1 1 1 a

anewton1998 commented 9 years ago

Hmm... You are right, && has higher precedence than || in C, Java, JS, etc. Oddly, the keyword forms "and" and "or" have equal precedence in Ruby (although && still binds tighter than || there).
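
For reference, Ruby's symbolic operators follow the C ordering, while the keyword forms "and" and "or" share one precedence level and associate left to right. A minimal sketch (the method names are made up for illustration):

```ruby
# && binds tighter than ||, as in C: parsed as true || (false && false)
def symbolic_mix
  true || false && false
end

# "and" and "or" share one precedence level and associate left to right:
# parsed as (true or false) and false
def keyword_mix
  (true or false and false)
end

puts "symbolic: #{symbolic_mix}"  # prints "symbolic: true"
puts "keyword:  #{keyword_mix}"   # prints "keyword:  false"
```

So the same operands give different answers depending on which spelling is used, which is one more reason not to lean on readers' intuition here.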

All that being said, this is really about matching and not strictly math operations. We are matching rules against the content of arrays or objects, and if the matching is not from left to right it will be difficult to follow, given that some rules (regular expressions, repetitions) can match more than one item/member. Having to group by precedence first will make predicting the matching difficult and result in more complicated implementations. With a left-to-right order and the requirement that every item/member match one rule and one rule only, predictability is kept simple, even for unordered arrays. Left-to-right order also negates the precedence of groups, rendering them simply collections (as was originally intended).

codalogic commented 9 years ago

This has made me think... which is not good.

What if you had an object that allowed 3 members to be present, but you only wanted a maximum of 2. I was thinking we'd do it like:

rule { ( a, b ) | ( a, c ) | ( b, c ) }

A rule like the following also seems nasty (b is optional if c is present, otherwise it's mandatory):

rule { ( a, b ) | ( a, ?b, c ) }

What about arrays of the form:

rule [ ( "a", [ 10 : integer ] ) | ( "b", [ 15 : integer ] ) ]

I'm having trouble getting my head round it at the moment. Maybe we can park it for the time being, and then come up with some rules for the draft that will ensure that all implementations do the same, sensible thing.

anewton1998 commented 9 years ago

Alright... I'm getting your point. Let me think about this some, but I think you may be right about this.

anewton1998 commented 9 years ago

I have tests in my validator for all the examples above and some others:

arule [ ( :"a", [ 2 : integer ] ) | ( :"b", [ 4 : integer ] ) ]
matches
[ "a", [ 1, 2 ] ]

arule [ ( :"a", [ 2 : integer ] ) | ( :"b", [ 4 : integer ] ) ]
does not match
[ "a", [ 1, 2, 3, 4 ] ]

arule [ ( :"a", [ 2 : integer ] ) | ( :"b", [ 4 : integer ] ) ]
matches
[ "b", [ 1, 2, 3, 4 ] ]

arule [ ( :"a", [ 2 : integer ] ) | ( :"b", [ 4 : integer ] ) ]
does not match
[ "b", [ 1, 2 ] ]

arule [ :1, :2, :3 | :4 ]
matches
[ 1, 2, 3 ]

arule [ :1, :2, :3 | :4 ]
matches
[ 1, 2, 4 ]

arule [ :1, :2, :3 | :4 ]
does not match
[ 4 ]

arule [ ( :1, :2, :3 ) | :4 ]
matches
[ 1, 2, 3 ]

arule [ ( :1, :2, :3 ) | :4 ]
matches
[ 4 ]

arule [ ( :1, :2, :3 ) | :4 ]
does not match
[ 1, 2, 4 ]

anewton1998 commented 9 years ago

For ',' to have a higher precedence than '|' essentially means grouping all the ',' operations. To do groups, especially in arrays, the validator has to take special care to unwind side effects if a group fails to evaluate to true. So it's possible, but a validator would certainly be helped if that grouping were done at the syntax level.

Oddly enough, this is exactly what JSON Schema, XML Schema, and Relax NG do. Their syntax for combiners is a group. So I think we should leave the precedence of ',' and '|' to be equal and require people to use groups if they mean otherwise. The bonus here is that JCR is a little more flexible than those others.

codalogic commented 9 years ago

I'm still struggling here, especially for the cases where we flip between , and | combiners more than once. For example, can you show me what happens for the following cases:

Given:

arule [ :1, :2, :3 | :4, :5 ]

which of the following match or not:

[ 1, 2, 3 ]
[ 1, 2, 3, 5 ]
[ 1, 2, 4, 5 ]

Given:

arule [ :1, :2, :3 | :4 | :5 ]

which of the following match or not:

[ 1, 2, 3 ]
[ 1, 2, 4 ]
[ 1, 2, 5 ]
[ 1, 2, 3, 5 ]

Given:

arule [ :1, :2, :3 | :4 , :5 | :6 ]

which of the following match or not:

[ 1, 2, 3, 5 ]
[ 1, 2, 4, 6 ]
[ 1, 2, 4, 5 ]
+ any other combinations you think would help me.

codalogic commented 9 years ago

Also, Given:

arule [ :1, :2, :3 | :4 , :5, :6  | :7 ]

which of the following match or not:

[ 1, 2, 3, 5, 6 ]
[ 1, 2, 3, 5, 7 ]
[ 1, 2, 4, 5, 6 ]
[ 1, 2, 4, 5, 7 ]

Thanks

anewton1998 commented 9 years ago

Here are the results.

  it 'should demonstrate OR and AND logic 1' do
    tree = JCR.parse( 'arule [ :1, :2, :3 | :4, :5 ]' )
    mapping = JCR.map_rule_names( tree )
    JCR.check_rule_target_names( tree, mapping )
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5 ], mapping ).success ).to be_falsey
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5 ], mapping ).success ).to be_truthy
  end

  it 'should demonstrate OR and AND logic 2' do
    tree = JCR.parse( 'arule [ :1, :2, :3 | :4 | :5 ]' )
    mapping = JCR.map_rule_names( tree )
    JCR.check_rule_target_names( tree, mapping )
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 5 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5 ], mapping ).success ).to be_falsey
  end

  it 'should demonstrate OR and AND logic 3' do
    tree = JCR.parse( 'arule [ :1, :2, :3 | :4 , :5 | :6 ]' )
    mapping = JCR.map_rule_names( tree )
    JCR.check_rule_target_names( tree, mapping )
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5 ], mapping ).success ).to be_falsey
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 6 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5 ], mapping ).success ).to be_truthy
  end

  it 'should demonstrate OR and AND logic 4' do
    tree = JCR.parse( 'arule [ :1, :2, :3 | :4 , :5, :6  | :7 ]' )
    mapping = JCR.map_rule_names( tree )
    JCR.check_rule_target_names( tree, mapping )
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5, 6 ], mapping ).success ).to be_falsey
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5, 7 ], mapping ).success ).to be_falsey
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5, 6 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5, 7 ], mapping ).success ).to be_truthy
  end

anewton1998 commented 9 years ago

This is making me think I have a bug in my code.

anewton1998 commented 9 years ago

OK. Yes, there was a bug and it was doing things I didn't expect. Now it is working the way I predict.

  it 'should demonstrate OR and AND logic 1' do
    tree = JCR.parse( 'arule [ :1, :2, :3 | :4, :5 ]' )
    mapping = JCR.map_rule_names( tree )
    JCR.check_rule_target_names( tree, mapping )
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3 ], mapping ).success ).to be_falsey
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5 ], mapping ).success ).to be_truthy
  end

  it 'should demonstrate OR and AND logic 2' do
    tree = JCR.parse( 'arule [ :1, :2, :3 | :4 | :5 ]' )
    mapping = JCR.map_rule_names( tree )
    JCR.check_rule_target_names( tree, mapping )
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 5 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5 ], mapping ).success ).to be_falsey
  end

  it 'should demonstrate OR and AND logic 3' do
    tree = JCR.parse( 'arule [ :1, :2, :3 | :4 , :5 | :6 ]' )
    mapping = JCR.map_rule_names( tree )
    JCR.check_rule_target_names( tree, mapping )
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 6 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 6 ], mapping ).success ).to be_truthy
  end

  it 'should demonstrate OR and AND logic 4' do
    tree = JCR.parse( 'arule [ :1, :2, :3 | :4 , :5, :6  | :7 ]' )
    mapping = JCR.map_rule_names( tree )
    JCR.check_rule_target_names( tree, mapping )
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5, 6 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5, 7 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5, 6 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5, 7 ], mapping ).success ).to be_truthy
  end

anewton1998 commented 9 years ago

And one more example, taken from your truth table above.

  it 'should demonstrate OR and AND logic 5' do
    tree = JCR.parse( 'arule [ :1 | :2 , :3 | :4 ]' )
    mapping = JCR.map_rule_names( tree )
    JCR.check_rule_target_names( tree, mapping )
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 3 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 2, 4 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 4 ], mapping ).success ).to be_truthy
    expect( JCR.evaluate_rule( tree[0], tree[0], [ 2, 3 ], mapping ).success ).to be_truthy
  end

codalogic commented 9 years ago

If I group up the rules so the precedence is explicit, I end up doing this:

tree = JCR.parse( 'arule [ :1, :2, (:3 | :4), :5 ]' )
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3 ], mapping ).success ).to be_falsey
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5 ], mapping ).success ).to be_truthy

tree = JCR.parse( 'arule [ :1, :2, (:3 | :4 | :5) ]' )
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 5 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5 ], mapping ).success ).to be_falsey

tree = JCR.parse( 'arule [ :1, :2, (:3 | :4), (:5 | :6) ]' )
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 6 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 6 ], mapping ).success ).to be_truthy

tree = JCR.parse( 'arule [ :1, :2, (:3 | :4) , :5, (:6  | :7) ]' )
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5, 6 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 3, 5, 7 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5, 6 ], mapping ).success ).to be_truthy
expect( JCR.evaluate_rule( tree[0], tree[0], [ 1, 2, 4, 5, 7 ], mapping ).success ).to be_truthy

From these examples it looks like the | operator has precedence over the , operator. Is that how you see it, or do there need to be some more examples that show where that assumption breaks? (I'm afraid my preconceptions are not allowing me to see past one having precedence over the other, so I'm finding it hard to come up with examples that break it.)
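
To make that reading concrete, here is a toy model in plain Ruby. It is only a sketch of the "| binds tighter than ," interpretation for flat rules of single literals, and it is not how the validator is implemented. The method name and the data layout (one list of acceptable literals per array position) are assumptions made for illustration.

```ruby
# Toy model: a flat rule is a list of positions, and each position is a list
# of acceptable literals. For example
#   [ :1, :2, :3 | :4, :5 ]   becomes   [[1], [2], [3, 4], [5]]
def position_match?(positions, array)
  positions.length == array.length &&
    positions.zip(array).all? { |alternatives, item| alternatives.include?(item) }
end

rule1 = [[1], [2], [3, 4], [5]]   # arule [ :1, :2, (:3 | :4), :5 ]
puts position_match?(rule1, [1, 2, 3])      # false
puts position_match?(rule1, [1, 2, 3, 5])   # true
puts position_match?(rule1, [1, 2, 4, 5])   # true
```

For these inputs the toy model agrees with the grouped expectations listed above. It says nothing about repetitions or nested rules.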

codalogic commented 9 years ago

This is how I’m thinking it could work at the moment:

Given the rule:

arule [ 3:8, 2:7, 1*5:6 | :5, :4 ]

1 - Map each inner value rule to an abstract symbol, so :8 is represented by A1, :7 is represented by A2 and so on.

2 - Build a regular expression that represents the rule using the abstract symbols. In the above case we’d get:

(A1){3}(A2){2}(A3){1,5}|(A4)(A5)

Which when anchored at both ends and put into a regular expression looks like:

/^((A1){3}(A2){2}(A3){1,5}|(A4)(A5))$/

3 - When parsing a JSON instance, each input value is mapped to its abstract symbol, for example, for:

[ 8, 8, 8, 7, 7, 6, 6 ]

The representative abstract sequence would be:

A1A1A1A2A2A3A3

4 - Then test that input sequence against the regular expression made in step 2. If it’s valid then accept the array.
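
Steps 1 to 4 can be sketched in a few lines of Ruby. This is only an illustration for the example rule above: the names SYMBOLS, RULE_RE, to_symbols and array_valid? are hypothetical, the symbol table is written by hand rather than derived from a parsed rule, and the regular expression is the one from step 2 with Ruby's \A and \z as the anchors.

```ruby
# Step 1: map each literal in the rule to an abstract symbol (hand-written here).
SYMBOLS = { 8 => "A1", 7 => "A2", 6 => "A3", 5 => "A4", 4 => "A5" }.freeze

# Step 2: the rule as an anchored regular expression over those symbols.
# \A and \z anchor the whole string (in Ruby, ^ and $ only anchor lines).
RULE_RE = /\A((A1){3}(A2){2}(A3){1,5}|(A4)(A5))\z/

# Step 3: turn a JSON array into its abstract symbol sequence.
# Values the rule does not mention become "??", which can never match.
def to_symbols(array)
  array.map { |value| SYMBOLS.fetch(value, "??") }.join
end

# Step 4: accept the array if the symbol sequence matches the expression.
def array_valid?(array)
  RULE_RE.match?(to_symbols(array))
end

puts to_symbols([8, 8, 8, 7, 7, 6, 6])    # A1A1A1A2A2A3A3
puts array_valid?([8, 8, 8, 7, 7, 6, 6])  # true
puts array_valid?([5, 4])                 # true
puts array_valid?([8, 8, 7, 7, 6])        # false (only two 8s)
```

With ten or more symbols the names would need a delimiter, otherwise A1 followed by 0 could be misread as A10.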

For unordered arrays the abstract symbols need to be sorted according to the order they are given in the rule on a first match basis (so [ 7, 8, 6, 8, 7, 8, 6 ] -> A2A1A3A1A2A1A3 -> A1A1A1A2A2A3A3) (Actually I think there is more reordering required for the unordered case, but I’ll leave that for later!)

How feasible does that look? Or have I lost the plot!

codalogic commented 9 years ago

I've been thinking some more about the regular expression mapping approach, and while I think it might work well for ordered arrays, it won't extend well to unordered arrays and objects. Shame, as it seemed to off-load the complicated bits to readily available code.

anewton1998 commented 9 years ago

Answering a couple of questions you asked in reverse order:

1) Regarding the feasibility of your implementation, I agree that it would present a problem with unordered arrays. However, it's a method I had not thought about. Where it might have trouble is with arrays within arrays and objects within arrays.

2) Regarding the precedence order, yes, it does appear that | has higher precedence than , but my code is really not doing that. It simply gives them equal treatment by processing the AND or OR condition before each rule is evaluated, and the rules are evaluated in the order they appear.

3) Getting back to your original question sometime back, should we force users to use groups and be explicit? Given all the confusion you and I are going through, that's probably not a bad idea. Do we think we can enforce that in the grammar?

codalogic commented 9 years ago

I'm thinking being explicit is the way to go too. Reading around the subject in other similar grammars, I get the impression that we're not the only ones who have had different interpretations of this. For example, I re-read RFC 5234 and it says:

3.5.  Sequence Group:  (Rule1 Rule2)

   Elements enclosed in parentheses are treated as a single element,
   whose contents are strictly ordered.  Thus,

         elem (foo / bar) blat

   matches (elem foo blat) or (elem bar blat), and

         elem foo / bar blat

   matches (elem foo) or (bar blat).

   NOTE:

      It is strongly advised that grouping notation be used, rather than
      relying on the proper reading of "bare" alternations, when
      alternatives consist of multiple rule names or literals.

   Hence, it is recommended that the following form be used:

        (elem foo) / (bar blat)

   It will avoid misinterpretation by casual readers.

and:

3.10.  Operator Precedence

   The various mechanisms described above have the following precedence,
   from highest (binding tightest) at the top, to lowest (loosest) at
   the bottom:

      Rule name, prose-val, Terminal value

      Comment

      Value range

      Repetition

      Grouping, Optional

      Concatenation

      Alternative

   Use of the alternative operator, freely mixed with concatenations,
   can be confusing.

      Again, it is recommended that the grouping operator be used to
      make explicit concatenation groups.

So they mentioned twice that it's a good idea to do explicit grouping, despite it not being required.

By its nature, most people will only be casual users of JCR, so we should make it as foolproof as possible, avoiding any gotchas where we can. The extra brackets will only come into play in the more complex scenarios anyway, so hopefully the noise they introduce will be minimal.

As to whether it can be worked into the ABNF, I'm not sure. I could have a quick look at it, but I think it could just as validly be handled narratively in the same way that all group members must be consistently member rules or value rules.

codalogic commented 9 years ago

FWIW, regarding your question:

1) Regarding the feasibility of your implementation, I agree that it would present a problem with
unordered arrays. However, it's a method I had not thought about. Where it might have trouble is
with arrays within arrays and objects within arrays.

I was envisioning that any child objects and arrays would be recursively validated, and if they were valid would be represented by their corresponding abstract symbol. So if you had JCR of:

[ *:int, [ "Fred", *:string ], [ "Bill", *:string ] ]

[ "Fred", *:string ] would have its own abstract symbol (say A2), and [ "Bill", *:string ] would have its own abstract symbol (say A3). The regular expression sequence would then be:

/^((A1)*(A2)(A3))$/

So if you got the JSON:

[ 12, 45, 18, [ "Fred", "local", "remote" ], [ "Bill", "upper", "lower" ] ]

the abstract symbol sequence for that would be:

A1A1A1A2A3
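
A rough Ruby sketch of that recursive idea, for this one example only. The helper names (tagged_strings?, symbol_for, outer_valid?) are hypothetical and the child rules are checked by hand-written code rather than by a generated expression; the point is just that a child array is validated first and then stands in the parent sequence as a single symbol.

```ruby
# True for arrays shaped like [ tag, *:string ], e.g. [ "Fred", "local" ].
def tagged_strings?(array, tag)
  array.first == tag && array.drop(1).all? { |s| s.is_a?(String) }
end

# Validate a value (recursing into child arrays) and return its symbol.
def symbol_for(value)
  case value
  when Integer
    "A1"                                   # *:int items
  when Array
    if tagged_strings?(value, "Fred")
      "A2"                                 # [ "Fred", *:string ]
    elsif tagged_strings?(value, "Bill")
      "A3"                                 # [ "Bill", *:string ]
    else
      "??"                                 # invalid child: can never match
    end
  else
    "??"
  end
end

OUTER_RE = /\A((A1)*(A2)(A3))\z/

def outer_valid?(array)
  OUTER_RE.match?(array.map { |value| symbol_for(value) }.join)
end

json = [12, 45, 18, ["Fred", "local", "remote"], ["Bill", "upper", "lower"]]
puts json.map { |value| symbol_for(value) }.join  # A1A1A1A2A3
puts outer_valid?(json)                           # true
```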

As you can guess, I'm still thinking about it!

codalogic commented 9 years ago

I've had a go at enforcing the grouping in the ABNF. It's at: https://github.com/codalogic/jcrvalidator/blob/explicit-precedence/lib/jcr/parser.rb . I haven't gone through fixing the tests yet. The revised rules look like:

rule(:object_items) { object_item >> (( spcCmnt? >> sequence_combiner >> spcCmnt? >> object_item ).repeat(1) |
                                      ( spcCmnt? >> choice_combiner >> spcCmnt? >> object_item ).repeat(1) ).maybe }
    #! object_items = object_item (*( sequence_combiner object_item ) /
    #!                             *( choice_combiner object_item ) )

and:

rule(:array_items)  { array_item >> (( spcCmnt? >> sequence_combiner >> spcCmnt? >> array_item ).repeat(1) |
                                     ( spcCmnt? >> choice_combiner >> spcCmnt? >> array_item ).repeat(1) ).maybe }
    #! array_items = array_item (*( sequence_combiner array_item ) /
    #!                           *( choice_combiner array_item ) )

and:

rule(:group_items)  { group_item >> (( spcCmnt? >> sequence_combiner >> spcCmnt? >> group_item ).repeat(1) |
                                     ( spcCmnt? >> choice_combiner >> spcCmnt? >> group_item ).repeat(1) ).maybe }
    #! group_items = group_item (*( sequence_combiner group_item ) /
    #!                           *( choice_combiner group_item ) )

The various repeat(1)s and maybes may look a bit odd, but that seemed to be what I had to do to get the parser to work.
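
The effect of those rules is that a single nesting level may use only one kind of combiner. A minimal stand-alone sketch of that check, independent of the parser (the method names are hypothetical, and it assumes the text holds one flat level with no nested groups, strings or comments):

```ruby
# Collect the combiner characters used at one flat nesting level.
def level_combiners(flat_rule_text)
  flat_rule_text.scan(/[,|]/)
end

# A level is acceptable when it uses at most one kind of combiner.
def explicit_grouping?(flat_rule_text)
  level_combiners(flat_rule_text).uniq.length <= 1
end

puts explicit_grouping?(":1, :2, :3")       # true
puts explicit_grouping?(":3 | :4 | :5")     # true
puts explicit_grouping?(":1, :2, :3 | :4")  # false: needs ( :3 | :4 )
```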

If you're happy with that I could look further at fixing up the tests.

anewton1998 commented 9 years ago

I can fix the tests. Let me pull this code and work on it. Thanks.

anewton1998 commented 8 years ago

done.