diprism / fggs

Factor Graph Grammars in Python
MIT License

Things that are still not convenient #140

Open davidweichiang opened 2 years ago

davidweichiang commented 2 years ago

We added some convenience functions but here are the things that are still rough:

davidweichiang commented 2 years ago

I think the first two problems (about the ordering of function calls) can be fixed if

Then we can write, instead of parser.py:131-135 and 168-170:

r = hrg.new_rule()
root = r.rhs.new_node('nonterminal')
r.rhs.new_edge('is_start', [root], is_terminal=True)
fgg.new_categorical_factor('is_start', ...) # this works because is_start got registered with fgg
r.rhs.new_edge('subtree', [root], is_nonterminal=True)
r.set_lhs('tree', []) # set both lhs and rhs.ext at the same time

I think that's more natural, right?

The sharing of the registry relates to #139 (maybe the registry should be a Labeling object instead of a mixin) and the simultaneous setting of lhs and rhs.ext relates to #79.
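To make the shared-registry idea concrete, here is a minimal, self-contained sketch. None of it is the fggs API: `Registry`, `Graph`, `Rule`, and `Grammar` are hypothetical stand-ins, and node labels are plain strings. It only illustrates two points from the snippet above: the right-hand side registers edge labels with the grammar that created the rule, and `set_lhs` sets the left-hand side and the external nodes in one call.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class Registry:
    """Hypothetical shared table: edge label name -> (is_terminal, type)."""
    edge_labels: Dict[str, Tuple[bool, Tuple[str, ...]]] = field(default_factory=dict)

    def register(self, name: str, is_terminal: bool, type_: Sequence[str]) -> None:
        entry = (is_terminal, tuple(type_))
        # Registering the same name twice is fine if the properties agree.
        if self.edge_labels.setdefault(name, entry) != entry:
            raise ValueError(f"conflicting definitions of edge label {name!r}")


@dataclass
class Graph:
    registry: Registry                      # shared with the owning grammar
    nodes: List[str] = field(default_factory=list)   # nodes[i] is the label of node i
    edges: List[Tuple[str, Tuple[int, ...]]] = field(default_factory=list)
    ext: Tuple[int, ...] = ()

    def new_node(self, label: str) -> int:
        self.nodes.append(label)
        return len(self.nodes) - 1

    def new_edge(self, name: str, nodes: Sequence[int], is_terminal: bool = False) -> None:
        # The edge label's type is read off the attached nodes.
        self.registry.register(name, is_terminal, [self.nodes[v] for v in nodes])
        self.edges.append((name, tuple(nodes)))


@dataclass
class Rule:
    rhs: Graph
    lhs: Optional[str] = None

    def set_lhs(self, name: str, ext: Sequence[int]) -> None:
        # Set lhs and rhs.ext together so they cannot disagree.
        self.rhs.registry.register(name, False, [self.rhs.nodes[v] for v in ext])
        self.lhs = name
        self.rhs.ext = tuple(ext)


class Grammar:
    def __init__(self) -> None:
        self.registry = Registry()
        self.rules: List[Rule] = []

    def new_rule(self) -> Rule:
        rule = Rule(rhs=Graph(self.registry))   # rule shares the grammar's registry
        self.rules.append(rule)
        return rule


g = Grammar()
r = g.new_rule()
root = r.rhs.new_node('nonterminal')
r.rhs.new_edge('is_start', [root], is_terminal=True)
r.rhs.new_edge('subtree', [root])
r.set_lhs('tree', [])
```

After this runs, `g.registry.edge_labels` already knows about `is_start`, `subtree`, and `tree`, which is the property that lets a later call on the grammar refer to `'is_start'` by name.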

davidweichiang commented 2 years ago

Currently an EdgeLabel contains both a name and other information (terminal, type), which is inconvenient when working with FGGs because you have to write

el1 = EdgeLabel("edge label one", [nl1, nl2, nl3], is_nonterminal=True)

so that the code has two representations for the same thing, el1 and "edge label one". In practice, it's common to programmatically create lots of edge labels (say, one for every bigram), so there are three representations: a pair object, an EdgeLabel object, and a string.

#126 cut down on this, but I think a big part of a real solution to the problems noted above will be to break EdgeLabel into two pieces: a name (let's say it's still a string for now) and EdgeProperties, which says whether an edge label is terminal and what its type is, but doesn't contain its name. Then you can write something like

fgg = FGG()
fgg.add_edge_label("edge label one", [nl1, nl2, nl3], is_nonterminal=True)

and any functions that want an edge label now take a string ("edge label one"). For example,

fgg.start = "edge label one"
e = Edge("edge label one", [v1, v2])
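A rough sketch of what the name/properties split could look like, under stated assumptions: `EdgeProperties` is the name proposed above, but its fields, the `LabelTable` class, and the `bigram_label` helper are all hypothetical and are not part of fggs. Node labels are plain strings here to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class EdgeProperties:
    """Everything about an edge label except its name (hypothetical fields)."""
    is_terminal: bool
    type: Tuple[str, ...]   # node label names, in order


class LabelTable:
    """Hypothetical stand-in for the part of FGG that stores edge labels."""

    def __init__(self) -> None:
        self._props: Dict[str, EdgeProperties] = {}

    def add_edge_label(self, name: str, type: Sequence[str],
                       is_nonterminal: bool = False) -> None:
        props = EdgeProperties(is_terminal=not is_nonterminal, type=tuple(type))
        # Adding the same name again is allowed only with identical properties.
        if self._props.get(name, props) != props:
            raise ValueError(f"edge label {name!r} already has different properties")
        self._props[name] = props

    def properties(self, name: str) -> EdgeProperties:
        return self._props[name]


def bigram_label(prev: str, cur: str) -> str:
    """Build the one string that identifies the edge label for a bigram."""
    return f"bigram {prev} {cur}"


table = LabelTable()
table.add_edge_label("edge label one", ["nl1", "nl2", "nl3"], is_nonterminal=True)
for prev, cur in [("the", "cat"), ("cat", "sat")]:
    table.add_edge_label(bigram_label(prev, cur), ["word", "word"])
```

With this shape, the string is the only handle that user code passes around, and the properties are looked up on demand, e.g. `table.properties(bigram_label("the", "cat"))`. For programmatically generated labels the pair still exists, but there is no separate label object to keep in sync with it.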