Various formats have a notion of edge features, which the current schema doesn't support very well.
In the UD importer, a simple dependency relation is two features:
word 1
    UD:lemma(str): the
    UD:head(ref): 2
    UD:deprel(str): det
word 2
    UD:lemma(str): penguin
but an enhanced dependency becomes a whole separate unit:
word 1
    UD:lemma(str): the
word 2
    UD:lemma(str): penguin
UD-edep 3
    UD:parent(ref): 2
    UD:child(ref): 1
    UD:deprel(str): det
This strikes me as an ugly hack, in addition to probably being highly unintuitive for anyone trying to use the data.
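To make the awkwardness concrete, here is a rough sketch of what a consumer has to do to list every dependency edge. The dictionary layout and the `collect_edges` helper are purely hypothetical stand-ins (this is not the importer's actual data model), and both encodings are combined into one sentence for illustration only.

```python
# Hypothetical in-memory stand-in for imported units; the real storage differs.
units = {
    1: {"type": "word", "features": {"UD:lemma": "the", "UD:head": 2, "UD:deprel": "det"}},
    2: {"type": "word", "features": {"UD:lemma": "penguin"}},
    3: {"type": "UD-edep", "features": {"UD:parent": 2, "UD:child": 1, "UD:deprel": "det"}},
}


def collect_edges(units):
    """Return (parent, child, deprel, kind) tuples from both encodings."""
    edges = []
    for unit_id, unit in sorted(units.items()):
        feats = unit["features"]
        if unit["type"] == "word" and "UD:head" in feats:
            # Basic dependency: the edge lives on the child word as two features.
            edges.append((feats["UD:head"], unit_id, feats["UD:deprel"], "basic"))
        elif unit["type"] == "UD-edep":
            # Enhanced dependency: the edge is a separate unit with its own refs.
            edges.append((feats["UD:parent"], feats["UD:child"], feats["UD:deprel"], "enhanced"))
    return edges


edges = collect_edges(units)
```

The same kind of information (a labelled edge between two words) needs two unrelated code paths, and the enhanced edges show up as units alongside the words.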
One potential solution is to have a second table of features which links to the relations table rather than the units table. The question then arises of whether edge features should have a separate tiers table as well, but I think this might be unnecessary (unless we wanted to have edge feature names keyed on the parent and child types).
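A minimal sketch of what that could look like, using an in-memory SQLite database. The table name `relation_features` and every column name here are assumptions made for illustration; the existing `units` and `relations` tables are reduced to the bare minimum and will not match the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the existing tables (columns are assumed).
CREATE TABLE units (id INTEGER PRIMARY KEY, type TEXT NOT NULL);
CREATE TABLE relations (
    id INTEGER PRIMARY KEY,
    parent INTEGER NOT NULL REFERENCES units(id),
    child INTEGER NOT NULL REFERENCES units(id)
);
-- Hypothetical second features table, keyed on a relation instead of a unit.
CREATE TABLE relation_features (
    relation INTEGER NOT NULL REFERENCES relations(id),
    name TEXT NOT NULL,
    value TEXT,
    PRIMARY KEY (relation, name)
);
""")

# "the" (1) depends on "penguin" (2) with deprel "det".
conn.executemany("INSERT INTO units (id, type) VALUES (?, ?)", [(1, "word"), (2, "word")])
conn.execute("INSERT INTO relations (id, parent, child) VALUES (1, 2, 1)")
conn.execute(
    "INSERT INTO relation_features (relation, name, value) VALUES (1, 'UD:deprel', 'det')"
)

rows = conn.execute("""
    SELECT r.parent, r.child, f.name, f.value
    FROM relations r
    JOIN relation_features f ON f.relation = r.id
""").fetchall()
```

With something along these lines, the enhanced dependency would no longer need a `UD-edep` unit: the edge is a row in `relations` and its label is a feature attached to that row. Whether feature names should additionally be constrained by a tiers-like table is left open here.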