diprism / fggs

Factor Graph Grammars in Python
MIT License
13 stars 3 forks source link

Node/terminal collision during conjunction is an error #55

Closed davidweichiang closed 3 years ago

davidweichiang commented 3 years ago

Closes #45 (again)

davidweichiang commented 3 years ago

I'm thinking it may be possible to fix #45 and #54 with a single "registry-merging" operation but maybe it'll be easier to think about that after the current PRs are merged.

darcey commented 3 years ago

(Just to check: should I hold off on reviewing this until the other PR goes through?)

davidweichiang commented 3 years ago

I think it's safe to merge this.

darcey commented 3 years ago

Oh no I just thought of a terrible stupid edge case. Renaming during conjunction could cause a namespace collision. For instance, if fgg1 had a nonterminal X, and fgg2 had a nonterminal Y, and one of the FGGs just so happened to already have a terminal named <X,Y>.

Maybe this is most easily solved by, like, requiring that terminal edge labels don't use the character <? (Or switching to some other character, or maybe it just can't start with < and end with >, since I can see someone wanting to make a terminal called <5.)

darcey commented 3 years ago

(Otherwise this PR looks good to me!)

davidweichiang commented 3 years ago

I can think of a few solutions:

1) The paired nonterminal labels should be actual pairs, not strings formed using < and >.

2) It should be an error if a collision of the type you described occurs (which the user will be rather powerless to avoid)

3) Conjunction should check for collision and do something silly like add an extra pair of angle brackets until there's no collision.

davidweichiang commented 3 years ago

I went with #3 but it can be changed later easily. The solution is to take the repr of each nonterminal before pairing them. This results in names like <'X','Y'>.

Old: pairing of X,Y and Z is <X,Y,Z> and pairing of X and Y,Z is also <X,Y,Z> New: these beomce <'X,Y','Z'> and <'X','Y,Z'>, respectively

davidweichiang commented 3 years ago

ok to merge this one?

darcey commented 3 years ago

Sorry, I'm confused -- how does this prevent the issue? I don't see where it keeps adding extra angle brackets until the collision goes away. (Am I misunderstanding the f-string syntax? I've never seen the !r before.)

Also, I'm confused at the new/old. I thought in the old version you'd get <<X,Y>,Z> and <X,<Y,Z>>, and in the new version you'd get <<'X','Y'>,'Z'> and <'X',<'Y','Z'>>? I must be misunderstanding something about string construction.

In any case, I'm fine with solution 3.

davidweichiang commented 3 years ago

I went with a different idea than what was exactly written in #3. The !r format means to take the repr of the object. In this case, it means to put quotes around the string, and to escape any quotes inside the string. This guarantees that no angle bracket or comma inside the string could possibly be confused for an angle bracket or command outside the string, so there's no possibility of collision.

In my example, it's the string X,Y (not the pair <X,Y>) that is being paired with Z, which is ambiguous with X paired with the string Y,Z (not the pair <Y,Z>).

davidweichiang commented 3 years ago

The current scheme is admittedly going to be very ugly with nested pairings: X paired with <'Y','Z'> is <'X',"<'Y','Z'>"> and then further nestings will cause backslashes to appear. Maybe the quoting needs to be toned down a bit.

davidweichiang commented 3 years ago

OK, I switched to a different method that does less quoting (but requires passing around a dictionary).

davidweichiang commented 3 years ago

ok, how about now?