Open danielhers opened 4 years ago
One subtlety is that, because F nodes are moved under the root, we are left with superfluous C nodes:
[F The] [H [P [C service] ] ... [D poor] [U ...] ] [F is]
Should they be removed? I.e.:
[F The] [H [P service ] ... [D poor] [U ...] ] [F is]
Scoring P and C separately here (in an edge-based evaluation) would seem inconsistent with the notion of ignoring where F attaches.
Yes, I think normalization (including C-flattening) should occur again after moving Fs.
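To make the proposed order concrete, here is a minimal sketch of "move Fs, then flatten C again" on a toy tree. The `Node` class and the functions `move_functions_to_root`, `flatten_centers`, and `to_brackets` are hypothetical names invented for this illustration; they are not the `ucca` package API. The sketch also appends the moved F nodes at the end of the root's children rather than preserving token order.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """Toy stand-in for a UCCA unit: a category tag plus children (nodes or tokens)."""
    tag: str
    children: List[Union["Node", str]] = field(default_factory=list)


def move_functions_to_root(root: Node) -> None:
    """Detach every F node found below the root and reattach it directly under the root."""
    moved: List[Node] = []

    def strip(node: Node) -> None:
        kept: List[Union[Node, str]] = []
        for child in node.children:
            if isinstance(child, Node):
                if child.tag == "F" and node is not root:
                    moved.append(child)  # collected here, reattached below
                    continue
                strip(child)
            kept.append(child)
        node.children = kept

    strip(root)
    root.children.extend(moved)


def flatten_centers(node: Node) -> None:
    """Bottom-up: if a node's only child is a C node, splice that C node's children in."""
    for child in node.children:
        if isinstance(child, Node):
            flatten_centers(child)
    if len(node.children) == 1:
        only = node.children[0]
        if isinstance(only, Node) and only.tag == "C":
            node.children = only.children


def to_brackets(node: Node) -> str:
    """Render the toy tree in the bracket notation used in this thread."""
    parts = [to_brackets(c) if isinstance(c, Node) else c for c in node.children]
    return "[" + node.tag + " " + " ".join(parts) + "]"


root = Node("ROOT", [
    Node("H", [
        Node("P", [Node("F", ["The"]), Node("C", ["service"]), Node("F", ["is"])]),
        Node("D", ["poor"]),
    ]),
])
move_functions_to_root(root)   # leaves [P [C service]] behind
flatten_centers(root)          # second normalization pass removes the unary C
print(to_brackets(root))       # [ROOT [H [P service] [D poor]] [F The] [F is]]
```

Running the flattening pass only before moving Fs would leave `[P [C service]]` in place, because at that point P still has three children.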
Should moving all Fs be part of normalization? For structures like [S [F the] [C xyz]] it would make it more transparent that xyz is evoking a scene.
Also: the confusion matrix code should match the F-score computation.
Evaluation is by spans: a predicted span is considered correct if its set of categories has a non-empty intersection with the gold categories for that span. This is a problem because parsers can just predict many unary edges or multi-category edges and not be penalized for it: https://github.com/danielhers/ucca/blob/master/ucca/evaluation.py#L102 @omriabnd @nschneid
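The effect can be reproduced with a toy model of the matching rule. This is a sketch under assumptions, not the code in `ucca/evaluation.py`: spans are represented as tuples of token indices mapped to category sets, and `lenient_scores`, `strict_scores`, and `f1` are hypothetical helpers. The strict variant, which scores each (span, category) pair as its own unit, is only one possible alternative and is shown for contrast.

```python
from typing import Dict, Set, Tuple

Span = Tuple[int, ...]
Labeled = Dict[Span, Set[str]]


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def lenient_scores(gold: Labeled, pred: Labeled) -> Tuple[float, float, float]:
    """A span is correct if predicted and gold category sets share at least one category."""
    matched = sum(1 for span, cats in pred.items() if cats & gold.get(span, set()))
    p = matched / len(pred) if pred else 0.0
    r = matched / len(gold) if gold else 0.0
    return p, r, f1(p, r)


def strict_scores(gold: Labeled, pred: Labeled) -> Tuple[float, float, float]:
    """Each (span, category) pair is scored separately, so extra categories cost precision."""
    gold_pairs = {(s, c) for s, cats in gold.items() for c in cats}
    pred_pairs = {(s, c) for s, cats in pred.items() for c in cats}
    matched = len(gold_pairs & pred_pairs)
    p = matched / len(pred_pairs) if pred_pairs else 0.0
    r = matched / len(gold_pairs) if gold_pairs else 0.0
    return p, r, f1(p, r)


gold: Labeled = {(0,): {"F"}, (1,): {"P"}, (2,): {"D"}}
# A degenerate parser that attaches every category to every span.
greedy: Labeled = {span: {"F", "P", "D", "C"} for span in gold}

print(lenient_scores(gold, greedy))  # perfect precision, recall and F1
print(strict_scores(gold, greedy))   # precision drops to 3 matched pairs out of 12
```

Under the lenient rule the over-predicting parser is indistinguishable from a perfect one, which is the behaviour this issue describes.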