Closed dabrahams closed 1 year ago
Actually I don't think you need the whole concept of derivation paths for this; if it won't disorient you too much I might just take it out.
Please leave the material on derivation paths in for the time being.
Sent from ProtonMail mobile
-------- Original Message -------- On Dec 3, 2022, 5:15 PM, Dave Abrahams wrote:
Actually I don't think you need the whole concept of derivation paths for this; if it won't disorient you too much I might just take it out.
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>
I'll need a while to read and think this over.
OK. I have an example I've worked through if you think that would be helpful.
Can't hurt. I'd be happy to see it. It may be easier to explain in terms of an example rather than math notation.
Grammar:
A ::= 'w' 'x' B | 'w'
B ::= C
C ::= 'y' 'z' A
where A is the start symbol. Input is "wxyzwxyzw"
The chart looks like this. The numbers and square brackets indicate origins and the earleme of the dot. Curly brackets surround a set of pre-dot origins, which I use to reconstruct derivation paths. Asterixes mark Leo items. The final column is a code I invented that describes how each item arose during recognition; you can ignore that or if you want I can explain it.
---------- 0 ----------
0: [0 A ::= [0 •'w' 'x' B P
1: [0 A ::= [0 •'w' P
---------- 1 ---------- 'w'
2: [0 A ::= [0 'w' 1] •'x' B <0['w']
3: [0 A 1] ::= [0 'w' 1] •
---------- 2 ---------- 'x'
4: [0 A 2] ::= [0 'w' 'x' B 2] •B* L5
5: [0 A ::= [0 'w' {1} 'x' 2] •B <2['x']
6: [0 A 2] ::= [0 'w' 'x' B 2] •C* L7p4
7: [2 B ::= [2 •C P5
8: [2 C ::= [2 •'y' 'z' A P7
---------- 3 ---------- 'y'
9: [2 C ::= [2 'y' 3] •'z' A <8['y']
---------- 4 ---------- 'z'
10: [0 A 4] ::= [0 'w' 'x' B 4] •A* L11p6
11: [2 C ::= [2 'y' {3} 'z' 4] •A <9['z']
12: [4 A ::= [4 •'w' 'x' B P11
13: [4 A ::= [4 •'w' P11
---------- 5 ---------- 'w'
14: [4 A ::= [4 'w' 5] •'x' B <12['w']
15: [0 A 5] ::= [0 'w' 'x' {2} B 5] • <10[16]
16: [4 A 5] ::= [4 'w' 5] • <13['w']
---------- 6 ---------- 'x'
17: [0 A 6] ::= [0 'w' 'x' B 6] •B* L18p10
18: [4 A ::= [4 'w' {5} 'x' 6] •B <14['x']
19: [0 A 6] ::= [0 'w' 'x' B 6] •C* L20p17
20: [6 B ::= [6 •C P18
21: [6 C ::= [6 •'y' 'z' A P20
---------- 7 ---------- 'y'
22: [6 C ::= [6 'y' 7] •'z' A <21['y']
---------- 8 ---------- 'z'
23: [0 A 8] ::= [0 'w' 'x' B 8] •A* L24p19
24: [6 C ::= [6 'y' {7} 'z' 8] •A <22['z']
25: [8 A ::= [8 •'w' 'x' B P24
26: [8 A ::= [8 •'w' P24
---------- 9 ---------- 'w'
27: [8 A ::= [8 'w' 9] •'x' B <25['w']
28: [0 A 9] ::= [0 'w' 'x' {2} B 9] • <23[29]
29: [8 A 9] ::= [8 'w' 9] • <26['w']
I also represented the chart another way, showing only the items relevant to a complete recognition. The lines with rules in them should be pretty self-explanatory. The numbers below each of those lines identify the items in the chart above that are relevant to recognizing the LHS of each rule line, i.e. they are the rule's derivation path (and possibly completion).
Set: 0 1 2 3 4 5 6 7 8 9
A ::= 'w' 'x' B....................................
0 2 5 28
B ::= C....................................
7
C ::= 'y' 'z' A........................
8 9 11
A ::= 'w' 'x' B............
15 14 18
B ::= C............
20
C ::= 'y' 'z' A
21 22 24
A ::= 'w'
26 29
As your book describes, only the top and bottom of the parse have completions (items 28 and 29).
When I'm analyzing the top line of the parse, I'll fail to find a completion for the B. I don't really need the completion items, but I need to know the rules involved and the sequence of numbers below each one.
Because there is no completion of a B with origin 2 in Earley set 9, I know item 28, which completes the first line, is the product of a Leo item. So let's analyze that:
You can see how this process continues upwards until it arrives back at the top line, discovering all the derivation paths.
I was able to figure out the code for the causation of the items, but if you could document it for anyone visiting or revisiting the thread, I think that would be helpful. Thanks!
You assume in the example there is no completion for B with origin 2 iff there is a Leo item with transition symbol B in Earley set. In the presence of ambiguity, I don't think this is, in general, true.
Imagine your grammar had an additional rule:
C ::= 'y' 'z' 'w' 'x' 'y' 'z' 'w'
Then I think there would be a completion for B in Earley set 9 with origin 2, as well as the Leo item. Both would be part of parses of the input string.
I am gradually getting back into the logic, and I hope I got this right.
Here's the chart I get with that additional rule in place (no derivation codes, this time; I don't generate them programmatically). I don't see the completion you're referring to in there. I believe that's because Leo item 6 (it's the same in both charts) preempts the completion that would normally have been generated by combining item 7 with the completion of that C (39 in the new chart).
---------- 0 ----------
0: [0 A ::= [0 •'w' 'x' B
1: [0 A ::= [0 •'w'
---------- 1 ----------
2: [0 A ::= [0 'w' 1] •'x' B
3: [0 A 1] ::= [0 'w' 1] •
---------- 2 ----------
4: [0 A] ::= [0 'w' 'x' {2} B•B*
5: [0 A ::= [0 'w' {1} 'x' 2] •B
6: [0 A] ::= [0 'w' 'x' {2} B•C*
7: [2 B ::= [2 •C
8: [2 C ::= [2 •'y' 'z' A
9: [2 C ::= [2 •'y' 'z' 'w' 'x' 'y' 'z' 'w'
---------- 3 ----------
10: [2 C ::= [2 'y' 3] •'z' A
11: [2 C ::= [2 'y' 3] •'z' 'w' 'x' 'y' 'z' 'w'
---------- 4 ----------
12: [0 A] ::= [0 'w' 'x' {2} B•A*
13: [2 C ::= [2 'y' {3} 'z' 4] •A
14: [2 C ::= [2 'y' {3} 'z' 4] •'w' 'x' 'y' 'z' 'w'
15: [4 A ::= [4 •'w' 'x' B
16: [4 A ::= [4 •'w'
---------- 5 ----------
17: [2 C ::= [2 'y' 'z' {4} 'w' 5] •'x' 'y' 'z' 'w'
18: [4 A ::= [4 'w' 5] •'x' B
19: [0 A 5] ::= [0 'w' 'x' {4} B 5] •
20: [4 A 5] ::= [4 'w' 5] •
---------- 6 ----------
21: [0 A] ::= [0 'w' 'x' {2} B•B*
22: [4 A ::= [4 'w' {5} 'x' 6] •B
23: [0 A] ::= [0 'w' 'x' {2} B•C*
24: [6 B ::= [6 •C
25: [2 C ::= [2 'y' 'z' 'w' {5} 'x' 6] •'y' 'z' 'w'
26: [6 C ::= [6 •'y' 'z' A
27: [6 C ::= [6 •'y' 'z' 'w' 'x' 'y' 'z' 'w'
---------- 7 ----------
28: [2 C ::= [2 'y' 'z' 'w' 'x' {6} 'y' 7] •'z' 'w'
29: [6 C ::= [6 'y' 7] •'z' A
30: [6 C ::= [6 'y' 7] •'z' 'w' 'x' 'y' 'z' 'w'
---------- 8 ----------
31: [0 A] ::= [0 'w' 'x' {2} B•A*
32: [6 C ::= [6 'y' {7} 'z' 8] •A
33: [2 C ::= [2 'y' 'z' 'w' 'x' 'y' {7} 'z' 8] •'w'
34: [6 C ::= [6 'y' {7} 'z' 8] •'w' 'x' 'y' 'z' 'w'
35: [8 A ::= [8 •'w' 'x' B
36: [8 A ::= [8 •'w'
---------- 9 ----------
37: [6 C ::= [6 'y' 'z' {8} 'w' 9] •'x' 'y' 'z' 'w'
38: [8 A ::= [8 'w' 9] •'x' B
39: [2 C 9] ::= [2 'y' 'z' 'w' 'x' 'y' 'z' {8} 'w' 9] •
40: [0 A 9] ::= [0 'w' 'x' {2 8} B 9] •
42: [8 A 9] ::= [8 'w' 9] •
---------- 10 ----------
[EDIT 1] For completeness I added the other diagram for this case: [EDIT 2] I used to be throwing out useful predot origin information for items generated by Leo, which would make deriving the missing completions harder, so I've added that information. Now as you can see item 40 has 2 predot origins, the first corresponding to combining Leo6 with completion 39, and the 2nd by combining Leo31 with completion 42.
Set: 0 1 2 3 4 5 6 7 8 9
A ::= 'w' 'x' B....................................
0 2 5 40
B ::= C....................................
7
C ::= 'y' 'z' A........................
8 10 13
C ::= 'y' 'z' 'w' 'x' 'y' 'z' 'w'
9 11 14 17 25 28 33 39
A ::= 'w' 'x' B............
15 18 22
B ::= C............
24
C ::= 'y' 'z' A
26 29 32
A ::= 'w'
36 42
I think your implementation might miss this parse:
A -> 'w' 'x' B
-> 'w' 'x' C
-> 'w' 'x' 'y' 'z' 'w' 'x' 'y' 'z' 'w'
That is, it's not only pre-empting the Leo completions, it's pre-empting others and losing parses.
I don't think it's losing that parse. The same procedure that finds the one parse in the original example finds both of the parses in your updated example. We explore both of the completion/Leo matchups in Earley set 9. In the new chart, we have completion 40 matching both the Earley41 / Leo31 pair (the old parse) and the Earley39/Leo6 pair. The source of Leo6 is item 7, which is the tail of the (one-element) derivation path for the missing B completion.
I'll run these 2 grammars under Marpa::R2 and compare. It may be the 2 algorithms just work differently.
Look forward to seeing the results. Did my explanation just above make sense to you?
It may, once I compare the Earley sets from the 2 implementations.
Sent from ProtonMail mobile
-------- Original Message -------- On Dec 4, 2022, 5:40 PM, Dave Abrahams wrote:
Look forward to seeing the results. Did my explanation just above make sense to you?
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were assigned.Message ID: @.***>
=== Progress @0 ===
P0 @0-0 L0c0 A -> . 'w' 'x' B
P1 @0-0 L0c0 A -> . 'w'
P5 @0-0 L0c0 :start -> . A
=== Progress @1 ===
R0:1 @0-1 L1c1 A -> 'w' . 'x' B
F1 @0-1 L1c1 A -> 'w' .
F5 @0-1 L1c1 :start -> A .
=== Progress @2 ===
R0:2 @0-2 L1c1-3 A -> 'w' 'x' . B
P2 @2-2 L1c3 B -> . C
P3 @2-2 L1c3 C -> . 'y' 'z' A
P4 @2-2 L1c3 C -> . 'y' 'z' 'w' 'x' 'y' 'z' 'w'
Leo item for transition on "B"
Leo item for transition on "C"
=== Progress @3 ===
R3:1 @2-3 L1c3-5 C -> 'y' . 'z' A
R4:1 @2-3 L1c3-5 C -> 'y' . 'z' 'w' 'x' 'y' 'z' 'w'
=== Progress @4 ===
P0 @4-4 L1c7 A -> . 'w' 'x' B
P1 @4-4 L1c7 A -> . 'w'
R3:2 @2-4 L1c3-7 C -> 'y' 'z' . A
R4:2 @2-4 L1c3-7 C -> 'y' 'z' . 'w' 'x' 'y' 'z' 'w'
Leo item for transition on "A"
=== Progress @5 ===
R0:1 @4-5 L1c7-9 A -> 'w' . 'x' B
F0 @0-5 L1c1-9 A -> 'w' 'x' B .
F1 @4-5 L1c7-9 A -> 'w' .
F2 @2-5 L1c3-9 B -> C .
F3 @2-5 L1c3-9 C -> 'y' 'z' A .
R4:3 @2-5 L1c3-9 C -> 'y' 'z' 'w' . 'x' 'y' 'z' 'w'
F5 @0-5 L1c1-9 :start -> A .
=== Progress @6 ===
R0:2 @4-6 L1c7-11 A -> 'w' 'x' . B
P2 @6-6 L1c11 B -> . C
P3 @6-6 L1c11 C -> . 'y' 'z' A
P4 @6-6 L1c11 C -> . 'y' 'z' 'w' 'x' 'y' 'z' 'w'
R4:4 @2-6 L1c3-11 C -> 'y' 'z' 'w' 'x' . 'y' 'z' 'w'
Leo item for transition on "B"
Leo item for transition on "C"
=== Progress @7 ===
R3:1 @6-7 L1c11-13 C -> 'y' . 'z' A
R4:1 @6-7 L1c11-13 C -> 'y' . 'z' 'w' 'x' 'y' 'z' 'w'
R4:5 @2-7 L1c3-13 C -> 'y' 'z' 'w' 'x' 'y' . 'z' 'w'
=== Progress @8 ===
P0 @8-8 L1c15 A -> . 'w' 'x' B
P1 @8-8 L1c15 A -> . 'w'
R3:2 @6-8 L1c11-15 C -> 'y' 'z' . A
R4:2 @6-8 L1c11-15 C -> 'y' 'z' . 'w' 'x' 'y' 'z' 'w'
R4:6 @2-8 L1c3-15 C -> 'y' 'z' 'w' 'x' 'y' 'z' . 'w'
Leo item for transition on "A"
=== Progress @9 ===
R0:1 @8-9 L1c15-17 A -> 'w' . 'x' B
F0 x2 @0,4-9 L1c1-17 A -> 'w' 'x' B .
F1 @8-9 L1c15-17 A -> 'w' .
F2 x2 @2,6-9 L1c3-17 B -> C .
F3 x2 @2,6-9 L1c3-17 C -> 'y' 'z' A .
R4:3 @6-9 L1c11-17 C -> 'y' 'z' 'w' . 'x' 'y' 'z' 'w'
F4 @2-9 L1c3-17 C -> 'y' 'z' 'w' 'x' 'y' 'z' 'w' .
F5 @0-9 L1c1-17 :start -> A .
Above is the output for the extended grammar from Marpa::R2. The items with the augment symbol <:start>
can be ignored. Looks the same except for these two lines in Earley set 9:
F2 x2 @2,6-9 L1c3-17 B -> C .
F3 x2 @2,6-9 L1c3-17 C -> 'y' 'z' A .
Note the x2
in the above --- these 2 lines represent 4 Earley items.
Btw, an augment symbol is something you will probably want. It prevents the start symbol from being recursive -- this causes problems, which is why use of an augment symbol is almost universal.
Going back to comment https://github.com/jeffreykegler/Marpa--R2/issues/288, I didn't follow it at all. In any case, the two items referenced match items in my R2 output. Your completion 40, 40: [0 A 9] ::= [0 'w' 'x' {2} B 9] •
, is my F0 x2 @0,4-9 L1c1-17 A -> 'w' 'x' B .
And your item 7, 7: [2 B ::= [2 •C
is my P2 @2-2 L1c3 B -> . C
.
Looks the same except for these two lines in Earley set 9
A couple of questions:
F2 x2 @2,6-9 L1c3-17 B -> C .
F3 x2 @2,6-9 L1c3-17 C -> 'y' 'z' A .
If I understand your notation, and the algorithm, correctly, the first of these could only have been generated by two things:
Going back to comment https://github.com/jeffreykegler/Marpa--R2/issues/288, I didn't follow it at all
Okay, I can try to write out a complete pseudocode for finding all the missing completions, or you can ask a question whose answer will help you understand, or we can proceed some other way that you suggest. Please let me know what you want to do.
Regarding the augment symbol, I figured I'd add one at such a time I found it to be necessary. So far, I haven't. If you have an example that demonstrates why it's needed, I'd love to see it.
The format of my progress lines is documented here. In the two lines in question the x2 @2,6-9
means the lines represents two Earley items, both ending at earleme 9. One starts at earleme 2 and the other at earleme 6. The two Earley items F2 x2 @2,6-9 L1c3-17 B -> C .
have the two Earley items F3 x2 @2,6-9 L1c3-17 C -> 'y' 'z' A .
as their tributaries. Of the two Earley items F3 x2 @2,6-9 L1c3-17 C -> 'y' 'z' A
, the one starting at 2 has one of the Earley items in F0 x2 @0,4-9 L1c1-17 A -> 'w' 'x' B .
as its tributary, and the one starting at 6 has the Earley item F1 @8-9 L1c15-17 A -> 'w' .
as its tributary.
By the way, use has convinced me that both my terminology, and the traditional, for the causation of Leo and Earley items is sub-optimal, and I'm thinking of switching to fluvial terms. These are not overloaded, are fairly intuitive and are part of a rich vocabulary. When two rivers meet one is the mainstem and the other is a tributary. I am thinking of calling predecessors "mainstems", and what I had variously been calling components or causes, "tributaries".
Leo's paper, and his results, assume that the grammar is augmented. As for an actual example of why, if I run across one in literature I'll pass it on. Though if you don't add an augment symbol, you might be the first to generate an example showing why one is needed. :-)
Wrt comment https://github.com/jeffreykegler/Marpa--R2/issues/288, I'm inclined to ignore the issue. My implementation generates both the Earley items in question, along with all the same tributaries and mainstems as your implementation does, so it is not clear that there's an actual issue at stake in https://github.com/jeffreykegler/Marpa--R2/issues/288.
OTOH, pseudo-code of your approach to finding missing completions might be helpful and I'd be glad to read it.
Currently absent from Marpa::R2 is a high-level trace output for Leo items. There are diagnostics, but they use the internal symbols and rules. My original thought was that Leo items, unlike Earley items, were an internal issue, and an internals-ish trace sufficed. But experience, as exemplified by this discussion, shows that not to be the case.
Since I have an example to hand, this is an excellent opportunity to add a more application-developer trace facility for Leo items to Marpa::R2, and that is underway.
The two Earley items F2 x2 @2,6-9 L1c3-17 B -> C . have the two Earley items F3 x2 @2,6-9 L1c3-17 C -> 'y' 'z' A . as their tributaries. Of the two Earley items F3 x2 @2,6-9 L1c3-17 C -> 'y' 'z' A , the one starting at 2 has one of the Earley items in F0 x2 @0,4-9 L1c1-17 A -> 'w' 'x' B . as its tributary, and the one starting at 6 has the Earley item F1 @8-9 L1c15-17 A -> 'w' . as its tributary.
OK I understand this, but the question behind my question remains: It seems to me that the the existence of Leo items with transition symbols A and B that accompany all Items of the form C-> 'y' 'z' •A and A -> 'w' 'x'•B respectively must prevent the completions of the C and A you cite from ever being generated. Why is that not the case? (Or, are these completions only reconstructed later as part of the very process we're discussing now?)
By the way, use has convinced me that both my terminology, and the traditional, for the causation of Leo and Earley items is sub-optimal, and I'm thinking of switching to fluvial terms. These are not overloaded, are fairly intuitive and are part of a rich vocabulary. When two rivers meet one is the mainstem and the other is a tributary. I am thinking of calling predecessors "mainstems", and what I had variously been calling components or causes, "tributaries".
🤷 I can work with either one. It all depends what you're shooting for: I think these terms (especially "mainstem") are pretty esoteric and will not be immediately obvious to many people. Either way, in all of your writing that I've read I don't think I ever came across the word component nor a definition of the word predecessor, so IMO a prominent definition of terms is more important than which words you choose.
Yes, the presence of the Leo items does prevent the addition of completions during recognition. At evaluation time, however, those which are accessible from the accept (aka top or start) Earley item must be present, in order to have a complete tree to traverse. There's a lot of reasons you want a complete tree -- one important one is that the missing nodes of the tree might have semantics attached.
Libmarpa adds the missing-but-accessible completions back in when it creates the bocage. And Marpa::R2's progress listings contain them because otherwise tracing the parse becomes very difficult.
Sent with Proton Mail secure email.
------- Original Message ------- On Monday, December 5th, 2022 at 3:30 PM, Dave Abrahams @.***> wrote:
The two Earley items F2 x2 @.(https://github.com/2),6-9 L1c3-17 B -> C . have the two Earley items F3 x2 @.(https://github.com/2),6-9 L1c3-17 C -> 'y' 'z' A . as their tributaries. Of the two Earley items F3 x2 @.(https://github.com/2),6-9 L1c3-17 C -> 'y' 'z' A , the one starting at 2 has one of the Earley items in F0 x2 @.(https://github.com/0),4-9 L1c1-17 A -> 'w' 'x' B . as its tributary, and the one starting at 6 has the Earley item F1 @.***(https://github.com/8-9) L1c15-17 A -> 'w' . as its tributary.
OK I understand this, but the question behind my question remains: It seems to me that the the existence of Leo items with transition symbols A and B that accompany all Items of the form C-> 'y' 'z' •A and A -> 'w' 'x'•B respectively must prevent the completions of the C and A you cite from ever being generated. Why is that not the case?
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were assigned.Message ID: @.***>
Predecessor and component (aka "cause") are defined for each operation in sections 5 and 6 of the arxiv.org paper.
At evaluation time, however, those which are accessible from the accept (aka top or start) Earley item must be present, in order to have a complete tree to traverse. There's a lot of reasons you want a complete tree -- one important one is that the missing nodes of the tree might have semantics attached.
I agree that you want a complete tree, but I disagree that I need to create those completions directly. I just need to store the location of the tail element of the derivation path for each completion, which amounts to much of the same information but can be generated on the fly as I get to the relevant part of a tree/forest. That way I don't have to do an additional pass over the whole tree to create them, and can just store the derivation tails relevant to what the user explores in the chart. It's not a big deal but I think it will be a bit more efficient in common cases.
I'm happy to see new approaches tried.
Sent from ProtonMail mobile
-------- Original Message -------- On Dec 5, 2022, 9:10 PM, Dave Abrahams wrote:
At evaluation time, however, those which are accessible from the accept (aka top or start) Earley item must be present, in order to have a complete tree to traverse. There's a lot of reasons you want a complete tree -- one important one is that the missing nodes of the tree might have semantics attached.
I agree that you want a complete tree, but I disagree that I need to create those completions directly. I just need to store the location of the tail element of the derivation path for each completion, which amounts to much of the same information but can be generated on the fly as I get to the relevant part of a tree/forest. That way I don't have to do an additional pass over the whole tree to create them, and can just store the derivation tails relevant to what the user explores in the chart. It's not a big deal but I think it will be a bit more efficient in common cases.
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were assigned.Message ID: @.***>
@dabrahams I requested this issue to be opened in this repo, but it turns out it is not the best. I want to transfer it to the Marpa::R2 repo and fortunately Github has a COOOL new feature that does that sort of thing. If it's OK with you, I will transfer this issue.
Marpa::R2 is best, because this issue is motivating a lot of improvement in the Leo diagnostics in Marpa::R2.
be my guest!
The following file contains nothing new -- it is the same as in my comment above, except using the new Marpa::R2 diagnostic, so that there is more detail for the Leo items.
=== Progress @0 ===
P0 @0-0 L0c0 A -> . 'w' 'x' B
P1 @0-0 L0c0 A -> . 'w'
P5 @0-0 L0c0 :start -> . A
=== Progress @1 ===
R0:1 @0-1 L1c1 A -> 'w' . 'x' B
F1 @0-1 L1c1 A -> 'w' .
F5 @0-1 L1c1 :start -> A .
=== Progress @2 ===
R0:2 @0-2 L1c1-3 A -> 'w' 'x' . B
P2 @2-2 L1c3 B -> . C
P3 @2-2 L1c3 C -> . 'y' 'z' A
P4 @2-2 L1c3 C -> . 'y' 'z' 'w' 'x' 'y' 'z' 'w'
L2 [[A -> 'w' 'x' . B]; "B"; 0]
L2 [[A -> 'w' 'x' . B]; "C"; 2]
=== Progress @3 ===
R3:1 @2-3 L1c3-5 C -> 'y' . 'z' A
R4:1 @2-3 L1c3-5 C -> 'y' . 'z' 'w' 'x' 'y' 'z' 'w'
=== Progress @4 ===
P0 @4-4 L1c7 A -> . 'w' 'x' B
P1 @4-4 L1c7 A -> . 'w'
R3:2 @2-4 L1c3-7 C -> 'y' 'z' . A
R4:2 @2-4 L1c3-7 C -> 'y' 'z' . 'w' 'x' 'y' 'z' 'w'
L4 [[A -> 'w' 'x' . B]; "A"; 2]
=== Progress @5 ===
R0:1 @4-5 L1c7-9 A -> 'w' . 'x' B
F0 @0-5 L1c1-9 A -> 'w' 'x' B .
F1 @4-5 L1c7-9 A -> 'w' .
F2 @2-5 L1c3-9 B -> C .
F3 @2-5 L1c3-9 C -> 'y' 'z' A .
R4:3 @2-5 L1c3-9 C -> 'y' 'z' 'w' . 'x' 'y' 'z' 'w'
F5 @0-5 L1c1-9 :start -> A .
=== Progress @6 ===
R0:2 @4-6 L1c7-11 A -> 'w' 'x' . B
P2 @6-6 L1c11 B -> . C
P3 @6-6 L1c11 C -> . 'y' 'z' A
P4 @6-6 L1c11 C -> . 'y' 'z' 'w' 'x' 'y' 'z' 'w'
R4:4 @2-6 L1c3-11 C -> 'y' 'z' 'w' 'x' . 'y' 'z' 'w'
L6 [[A -> 'w' 'x' . B]; "B"; 4]
L6 [[A -> 'w' 'x' . B]; "C"; 6]
=== Progress @7 ===
R3:1 @6-7 L1c11-13 C -> 'y' . 'z' A
R4:1 @6-7 L1c11-13 C -> 'y' . 'z' 'w' 'x' 'y' 'z' 'w'
R4:5 @2-7 L1c3-13 C -> 'y' 'z' 'w' 'x' 'y' . 'z' 'w'
=== Progress @8 ===
P0 @8-8 L1c15 A -> . 'w' 'x' B
P1 @8-8 L1c15 A -> . 'w'
R3:2 @6-8 L1c11-15 C -> 'y' 'z' . A
R4:2 @6-8 L1c11-15 C -> 'y' 'z' . 'w' 'x' 'y' 'z' 'w'
R4:6 @2-8 L1c3-15 C -> 'y' 'z' 'w' 'x' 'y' 'z' . 'w'
L8 [[A -> 'w' 'x' . B]; "A"; 6]
=== Progress @9 ===
R0:1 @8-9 L1c15-17 A -> 'w' . 'x' B
F0 x2 @0,4-9 L1c1-17 A -> 'w' 'x' B .
F1 @8-9 L1c15-17 A -> 'w' .
F2 x2 @2,6-9 L1c3-17 B -> C .
F3 x2 @2,6-9 L1c3-17 C -> 'y' 'z' A .
R4:3 @6-9 L1c11-17 C -> 'y' 'z' 'w' . 'x' 'y' 'z' 'w'
F4 @2-9 L1c3-17 C -> 'y' 'z' 'w' 'x' 'y' 'z' 'w' .
F5 @0-9 L1c1-17 :start -> A .
@dabrahams, if I remember correctly, with the completion of my Marpa::R2 Leo diagnostics, we are done with this issue. Absent feedback to the contrary, I will close this issue in a couple of days.
Sure, this one should be closed, but don't you want another for the paper, with the original topic?
I made the changes in the paper repo already. The issue is https://github.com/jeffreykegler/Marpa-arxiv-paper/issues/11. Please report any problems with the changed version under that issue, or in a new issue in the arxiv paper repo.
For your convenience, I've uploaded a PDF of the revised paper: https://drive.google.com/file/d/1_2AmXs6PhEqDNe8gAeaolps6HKT1nrsq/view?usp=sharing . The revisions are mainly in sections 5 and 6. If you are interested in looking at it, please download it in the next couple of days, as I regularly clean up that Google Drive directory.
Per your comment, I will close this issue after a couple of days, unless something comes up.
Per https://github.com/jeffreykegler/Marpa--R2/issues/288#issuecomment-1345326259, I am closing this issue.
I believe I found a case for which my logic fails. Consider this similar grammar with the same input:
A ::= 'w' 'x' B | 'w'
B ::= 'y' 'z' 'w' 'x' 'y' 'z' 'w'
B ::= 'y' 'z' A
for which I get the following chart:
---------- 0 ----------
0: [0 A ::= [0 •'w' 'x' B
1: [0 A ::= [0 •'w'
---------- 1 ----------
2: [0 A ::= [0 'w' 1] •'x' B
3: [0 A 1] ::= [0 'w' 1] •
---------- 2 ----------
4: [0 A] ::= [0 'w' 'x' {2} B•B*
5: [0 A ::= [0 'w' {1} 'x' 2] •B
6: [2 B ::= [2 •'y' 'z' 'w' 'x' 'y' 'z' 'w'
7: [2 B ::= [2 •'y' 'z' A
---------- 3 ----------
8: [2 B ::= [2 'y' 3] •'z' 'w' 'x' 'y' 'z' 'w'
9: [2 B ::= [2 'y' 3] •'z' A
---------- 4 ----------
10: [0 A] ::= [0 'w' 'x' {2} B•A*
11: [2 B ::= [2 'y' {3} 'z' 4] •A
12: [2 B ::= [2 'y' {3} 'z' 4] •'w' 'x' 'y' 'z' 'w'
13: [4 A ::= [4 •'w' 'x' B
14: [4 A ::= [4 •'w'
---------- 5 ----------
15: [2 B ::= [2 'y' 'z' {4} 'w' 5] •'x' 'y' 'z' 'w'
16: [4 A ::= [4 'w' 5] •'x' B
17: [0 A 5] ::= [0 'w' 'x' {4} B 5] •
18: [4 A 5] ::= [4 'w' 5] •
---------- 6 ----------
19: [0 A] ::= [0 'w' 'x' {2} B•B*
20: [4 A ::= [4 'w' {5} 'x' 6] •B
21: [2 B ::= [2 'y' 'z' 'w' {5} 'x' 6] •'y' 'z' 'w'
22: [6 B ::= [6 •'y' 'z' 'w' 'x' 'y' 'z' 'w'
23: [6 B ::= [6 •'y' 'z' A
---------- 7 ----------
24: [2 B ::= [2 'y' 'z' 'w' 'x' {6} 'y' 7] •'z' 'w'
25: [6 B ::= [6 'y' 7] •'z' 'w' 'x' 'y' 'z' 'w'
26: [6 B ::= [6 'y' 7] •'z' A
---------- 8 ----------
27: [0 A] ::= [0 'w' 'x' {2} B•A*
28: [6 B ::= [6 'y' {7} 'z' 8] •A
29: [2 B ::= [2 'y' 'z' 'w' 'x' 'y' {7} 'z' 8] •'w'
30: [6 B ::= [6 'y' {7} 'z' 8] •'w' 'x' 'y' 'z' 'w'
31: [8 A ::= [8 •'w' 'x' B
32: [8 A ::= [8 •'w'
---------- 9 ----------
33: [6 B ::= [6 'y' 'z' {8} 'w' 9] •'x' 'y' 'z' 'w'
34: [8 A ::= [8 'w' 9] •'x' B
35: [2 B 9] ::= [2 'y' 'z' 'w' 'x' 'y' 'z' {8} 'w' 9] •
36: [0 A 9] ::= [0 'w' 'x' {2 8} B 9] •
38: [8 A 9] ::= [8 'w' 9] •
---------- 10 ----------
Set: 0 1 2 3 4 5 6 7 8 9
A ::= 'w' 'x' B....................................
0 2 5 36
B ::= 'y' 'z' A........................
7 9 11
B ::= 'y' 'z' 'w' 'x' 'y' 'z' 'w'
6 8 12 15 21 21 29 35
A ::= 'w' 'x' B............
13 16 20
B ::= 'y' 'z' A
23 26 28
A ::= 'w'
32 38
Item 36 (the completion of A over 2-9) is derived two ways:
Since there is a perfectly legitimate completion of B covering sets 2-9 (item 35), nothing would cause my earlier-stated procedure to consider reconstructing the Leo-eliminated completion of the derivation path for B made of items 7, 9, and 11.
It seems that in general, any completion that's actually found in the chart could have arisen in either way. I think I can adjust to account for this fact and will post an update here when I've thought the problem through.
@jeffreykegler mentioned that his description of how to analyze Leo chains might be too daunting. Here's what I'm planning to do:
Definition: A derivation path is a sequence, in Earley set order, of predecessor Earley items of a completion, not including the completion, even if that completion was optimized out via Leo.
Premise: In Earley set k we have a completion X = [A→𝜶B., i], where the origin of B is known to be j, but can find no completion for a rule B→𝜷 with origin j in the same Earley set.
This completion must have been optimized away by Leo. To find all completions missing from set k that would otherwise contribute to X:
X