amir-zeldes / gum

Repository for the Georgetown University Multilayer Corpus (GUM)
https://gucorpling.org/gum/

Cycles in enhanced graph #173

Open nschneid opened 1 year ago

nschneid commented 1 year ago

https://github.com/amir-zeldes/gum/blob/4777dc46ae2bdcf463513ea83fd071a5f84a8954/_build/utils/eng_enhance.ini#L153

https://github.com/amir-zeldes/gum/blob/4777dc46ae2bdcf463513ea83fd071a5f84a8954/_build/utils/eng_enhance.ini#L163

These comments reference avoiding making the graph cyclic, but I think cycles are actually OK, as mentioned here. We just want to avoid self-edges.
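A check for self-edges could look roughly like the sketch below. It assumes standard 10-column CoNLL-U input with enhanced dependencies in the DEPS column (for example `2:nsubj|4:nsubj:xsubj`). The helper names `parse_deps` and `find_self_edges` are made up for illustration and are not part of the GUM build scripts.

```python
def parse_deps(deps_field):
    """Parse a CoNLL-U DEPS value into a list of (head, deprel) pairs."""
    if deps_field == "_":
        return []
    pairs = []
    for item in deps_field.split("|"):
        # Split on the first colon only: the deprel itself may contain colons.
        head, deprel = item.split(":", 1)
        pairs.append((head, deprel))
    return pairs


def find_self_edges(conllu_sentence):
    """Return (token_id, deprel) for every enhanced edge whose head is its own dependent."""
    self_edges = []
    for line in conllu_sentence.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or "-" in cols[0]:
            continue  # skip malformed lines and multiword token ranges
        tok_id = cols[0]
        for head, deprel in parse_deps(cols[8]):
            if head == tok_id:
                self_edges.append((tok_id, deprel))
    return self_edges


# Hypothetical two-token sentence in which token 2 points at itself.
rows = [
    ["1", "a", "a", "X", "X", "_", "0", "root", "0:root", "_"],
    ["2", "b", "b", "X", "X", "_", "1", "dep", "1:dep|2:nsubj:xsubj", "_"],
]
sentence = "\n".join("\t".join(r) for r in rows)
print(find_self_edges(sentence))  # [('2', 'nsubj:xsubj')]
```

Token IDs are compared as strings so that empty-node IDs such as `8.1` are handled without conversion.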

amir-zeldes commented 1 year ago

In principle, yes: the edeps specification allows cycles. In practice, though, ANNIS can't handle cycles within a single component, which is a somewhat complicated construct to explain. The gist is that cycles between regular deps and edeps are fine, but cycles within either one of them are not. Since the build bot doesn't materialize edeps that are identical to basic deps, we can largely get away with edeps as intended. The few exceptions are the ones noted in the script, and GUM edeps currently lack those edges in conllu as well. Moving forward it would probably be nice to have these in conllu and either remove them before it goes to ANNIS, or find a different way to represent them (e.g. have another layer "edeps2" or something, which would be allowed to conflict with other edeps, or use a label instead of an edge somehow).
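To make the component distinction concrete, here is a minimal sketch using the standard enhanced-UD relative clause pattern ("the boy who lived"). It is a simplified model of the constraint as described in this comment, not ANNIS's actual implementation; the function names and the `(head, dependent, label)` tuple layout are assumptions chosen for illustration.

```python
def materialized_edeps(basic, enhanced):
    """Keep only the enhanced edges that are not identical to a basic edge."""
    basic_set = set(basic)
    return [edge for edge in enhanced if edge not in basic_set]


def has_cycle(edges):
    """Depth-first search for a directed cycle among (head, dependent, label) edges."""
    graph = {}
    for head, dep, _label in edges:
        graph.setdefault(head, []).append(dep)
    state = {}  # node -> "active" while on the DFS stack, "done" once explored

    def visit(node):
        state[node] = "active"
        for nxt in graph.get(node, []):
            if state.get(nxt) == "active":
                return True  # back edge: nxt is an ancestor of node (or node itself)
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(visit(n) for n in list(graph) if n not in state)


# Tokens: 1=the, 2=boy, 3=who, 4=lived (root edge omitted for brevity).
basic = [("2", "1", "det"), ("2", "4", "acl:relcl"), ("4", "3", "nsubj")]
enhanced = [("2", "1", "det"), ("2", "4", "acl:relcl"),
            ("4", "2", "nsubj"), ("2", "3", "ref")]

extra = materialized_edeps(basic, enhanced)
print(has_cycle(enhanced))       # True: boy -> lived -> boy in the full enhanced graph
print(has_cycle(extra))          # False: no cycle among the materialized edeps alone
print(has_cycle(basic + extra))  # True: the cycle only appears across the two layers
```

In this model the relative clause cycle is unproblematic because it spans both layers. Only a cycle among the materialized edeps themselves (or a self-edge, which `has_cycle` also reports) would fall into the problematic case.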

nschneid commented 1 year ago

> Moving forward it would probably be nice to have these in conllu and either remove them before it goes to ANNIS, or find a different way to represent them

+1