Open TheresaSchmidt opened 2 years ago
- Double-check parser evaluation script: does it look at tokens that are not tagged?
As far as I can tell, the parser evaluation only looks at the HEAD and REL columns, unless you changed anything, @siyutao?
- Did this also happen with the old parser?
Yes it did; see for example sausage_gravy_3 (token 78 'add') in round2_allennlp08_tagged_parsed
- Are there edges pointing to O-tagged tokens in the gold data?
Doesn't seem like it from this script.
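A quick way to re-check this on any file (gold or parsed) is to list every edge that has an O-tagged token at either end. The sketch below is only illustrative: the `Token` layout (id, form, tag, head, rel), the convention that head `0` means "no edge", and the example tags and relation labels are assumptions, not the project's actual file format.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    id: int    # 1-based token index (assumed)
    form: str
    tag: str   # "O" marks a token that is not a graph node
    head: int  # id of the head token; 0 = no edge (assumed convention)
    rel: str


def edges_touching_o(tokens: List[Token]) -> List[Tuple[int, int, str]]:
    """Return (dependent_id, head_id, rel) for each edge with an O-tagged endpoint."""
    tag_of = {t.id: t.tag for t in tokens}
    bad = []
    for t in tokens:
        if t.head == 0:
            continue  # token has no outgoing edge
        if t.tag == "O" or tag_of.get(t.head) == "O":
            bad.append((t.id, t.head, t.rel))
    return bad


# Hypothetical three-token example; tags and labels are made up.
sample = [
    Token(1, "add", "B-X", 0, "root"),
    Token(2, "the", "O", 3, "dep"),
    Token(3, "sausage", "B-Y", 1, "dep"),
]
print(edges_touching_o(sample))
```

An empty result for the gold data would confirm the answer above; a non-empty result for parser output gives the list of edges to look at.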
(Oops, apparently this didn't send the other day.) Edges to and from O-tagged tokens can easily be removed in post-processing. We should probably discuss or find out how meaningful they are, and it would be nice if the parser didn't generate them in the first place.
Re "meaningful": e.g. whether we should delete the loose edges connected to non-nodes or turn the non-nodes into nodes. Currently the loose edges are ignored because scripts like reduce_graph.py only look at phrases with non-O tags when finding edges.
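For the first option (deleting the loose edges), a post-processing pass could look roughly like the sketch below. It is a guess at the shape of such a step, not existing project code: the dict keys (`id`, `tag`, `head`, `rel`) and the placeholders used for "no edge" (`0` and `"_"`) are assumptions, and it works on single tokens rather than multi-token phrases.

```python
def strip_o_edges(rows, no_head=0, no_rel="_"):
    """Return a copy of rows in which every edge to or from an O-tagged token is removed.

    rows: list of dicts with keys "id", "tag", "head", "rel" (assumed layout).
    """
    o_ids = {r["id"] for r in rows if r["tag"] == "O"}
    cleaned = []
    for r in rows:
        if r["id"] in o_ids or r["head"] in o_ids:
            # Drop the edge by resetting HEAD and REL; the token itself stays.
            r = {**r, "head": no_head, "rel": no_rel}
        cleaned.append(r)
    return cleaned


# Hypothetical example: token 2 is O-tagged, token 4 points at it.
rows = [
    {"id": 1, "tag": "B-X", "head": 0, "rel": "root"},
    {"id": 2, "tag": "O", "head": 3, "rel": "dep"},
    {"id": 3, "tag": "B-Y", "head": 1, "rel": "dep"},
    {"id": 4, "tag": "B-Z", "head": 2, "rel": "dep"},
]
cleaned = strip_o_edges(rows)
```

The function returns new dicts and leaves the input untouched, so the original parser output stays available for comparison.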
The most recent parsed data has edges pointing to tokens (mostly determiners) that are tagged with O, i.e. tokens that shouldn't be in the graph at all because they are not nodes. a) This really shouldn't happen. b) Immediate TODOs: