kawu / feature-structure

BSD 2-Clause "Simplified" License
0 stars 0 forks source link

Parsing & unification: can we guarantee that nodes of the input graphs are disjoint? #6

Closed kawu closed 9 years ago

kawu commented 10 years ago

Motivation: if it's not possible to make this assumption, we will generally have to rename (or prefix) nodes in both input graphs before performing unification. Edges will need to be modified as well. But that's not all -- apart from graphs, there are also accompanying structures which hold pointers (identifiers) to graph nodes and which take part in unification. These structures will have to be modified too. All in all, there's a lot of work (which probably lifts complexity to a higher level) which could be avoided if only we could assume "disjointness".

Proposition: all feature structures taking part in the parsing algorithm are introduced at the preprocessing stage. During the proper parsing phase no new feature structures are introduced.

Conclusion: if the proposition above is true, we can re-identify all feature structures in the preprocessing stage and therefore ensure that the assumption (disjointness of graph nodes) will hold during the entire algorithm.

kawu commented 10 years ago

O(n^2) instances of every rule have to be considered during parsing. It seems to not be an optimal solution, from the memory consumption point of view, to introduce all the instances at the same time.

It might be better to use a ,,factory'' monad which will produce new rule instances on the fly. But then, of course, we will need to implement the parsing algorithm over the factory monad.

kawu commented 10 years ago

It might be also worth noting that if we introduce rule instances at the preprocessing stage, we might reuse these instances when parsing different sentences. So the former solution has some advantages as well.

kawu commented 9 years ago

All this is relevant not only for parsing FS-adorned CFGs. However, again, parsing is not the goal of this library.