Closed davidweichiang closed 3 years ago
I'm thinking it may be possible to fix #45 and #54 with a single "registry-merging" operation but maybe it'll be easier to think about that after the current PRs are merged.
(Just to check: should I hold off on reviewing this until the other PR goes through?)
I think it's safe to merge this.
Oh no I just thought of a terrible stupid edge case. Renaming during conjunction could cause a namespace collision. For instance, if fgg1 had a nonterminal X
, and fgg2 had a nonterminal Y
, and one of the FGGs just so happened to already have a terminal named <X,Y>
.
Maybe this is most easily solved by, like, requiring that terminal edge labels don't use the character <
? (Or switching to some other character, or maybe it just can't start with <
and end with >
, since I can see someone wanting to make a terminal called <5
.)
(Otherwise this PR looks good to me!)
I can think of a few solutions:
1) The paired nonterminal labels should be actual pairs, not strings formed using <
and >
.
2) It should be an error if a collision of the type you described occurs (which the user will be rather powerless to avoid)
3) Conjunction should check for collision and do something silly like add an extra pair of angle brackets until there's no collision.
I went with #3 but it can be changed later easily. The solution is to take the repr
of each nonterminal before pairing them. This results in names like <'X','Y'>
.
Old: pairing of X,Y
and Z
is <X,Y,Z>
and pairing of X
and Y,Z
is also <X,Y,Z>
New: these beomce <'X,Y','Z'>
and <'X','Y,Z'>
, respectively
ok to merge this one?
Sorry, I'm confused -- how does this prevent the issue? I don't see where it keeps adding extra angle brackets until the collision goes away. (Am I misunderstanding the f-string syntax? I've never seen the !r before.)
Also, I'm confused at the new/old. I thought in the old version you'd get <<X,Y>,Z>
and <X,<Y,Z>>
, and in the new version you'd get <<'X','Y'>,'Z'>
and <'X',<'Y','Z'>>
? I must be misunderstanding something about string construction.
In any case, I'm fine with solution 3.
I went with a different idea than what was exactly written in #3. The !r
format means to take the repr
of the object. In this case, it means to put quotes around the string, and to escape any quotes inside the string. This guarantees that no angle bracket or comma inside the string could possibly be confused for an angle bracket or command outside the string, so there's no possibility of collision.
In my example, it's the string X,Y
(not the pair <X,Y>
) that is being paired with Z
, which is ambiguous with X
paired with the string Y,Z
(not the pair <Y,Z>
).
The current scheme is admittedly going to be very ugly with nested pairings: X
paired with <'Y','Z'>
is <'X',"<'Y','Z'>">
and then further nestings will cause backslashes to appear. Maybe the quoting needs to be toned down a bit.
OK, I switched to a different method that does less quoting (but requires passing around a dictionary).
ok, how about now?
Closes #45 (again)