Closed by alexanderkoller 5 years ago
@namednil check other corpora, then close.
It indeed works for UCCA; I still have to check the other corpora.
Since we don't carry the input string around for AMR, I decided to restore the "input" field at evaluation time. Passing the command-line argument "--input path/to/input.mrp" to EvaluateAMR (or EvaluateMRP) now fills in the correct value there and, for formalisms with anchoring, removes illegal anchors, printing a warning when it does so.
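For readers who want to see the idea in isolation, here is a minimal Python sketch of that evaluation-time fix. It is not the code in EvaluateAMR or EvaluateMRP. It assumes MRP files hold one JSON graph per line with "id", "input" and "nodes" fields, and that anchors are dicts with "from" and "to" character offsets. The function names and the exact test for an "illegal" anchor (offsets outside the input string) are assumptions made for illustration.

```python
import json
import sys


def read_mrp(path):
    """Read an MRP file, assumed to hold one JSON object per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def restore_inputs(graphs, original_inputs):
    """Overwrite each graph's "input" with the original string and drop
    anchors that do not fit inside it.

    graphs: list of MRP graphs as dicts (modified in place).
    original_inputs: dict mapping graph id -> original input string.
    """
    for graph in graphs:
        if graph["id"] not in original_inputs:
            continue  # no original input known for this graph; leave it alone
        text = original_inputs[graph["id"]]
        graph["input"] = text
        for node in graph.get("nodes", []):
            anchors = node.get("anchors")
            if anchors is None:
                continue  # formalism without anchoring, or unanchored node
            # Assumption: an anchor is legal if it lies within the input string.
            legal = [a for a in anchors
                     if 0 <= a["from"] <= a["to"] <= len(text)]
            if len(legal) != len(anchors):
                print(f"warning: removed illegal anchors in graph "
                      f"{graph['id']}, node {node.get('id')}", file=sys.stderr)
            node["anchors"] = legal
    return graphs
```

With this sketch, the original inputs could be collected as `{g["id"]: g["input"] for g in read_mrp("path/to/input.mrp")}` and passed to `restore_inputs` together with the parser output.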
After further discussion on https://github.com/cfmrp/mtool/issues/64, the correct thing to do is to use the original value of the "input" field in the MRP files the parser produces. The bug in #48 suggests that at one or more points in our code we simply concatenate the tokens in the companion data with spaces and put the result in the "input" field. This is incorrect and dangerous, because the space-joined string can differ from the original text that the character-offset anchors refer to.
We need to make sure that we use the correct strings in the "input" fields.
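One way to check this, sketched in Python under the assumption that graphs are dicts with "id" and "input" fields: compare each produced "input" against the original string and report the ids that differ. The function name and the example sentence are hypothetical. The first lines illustrate why joining tokens with spaces generally does not reproduce the original text.

```python
# Illustrative example: space-joining tokens changes the string.
original = "Don't stop."
tokens = ["Do", "n't", "stop", "."]
joined = " ".join(tokens)  # "Do n't stop ." differs from the original


def find_mismatched_inputs(produced, original_inputs):
    """Return ids of graphs whose "input" differs from the original string.

    produced: list of graphs (dicts) written by the parser.
    original_inputs: dict mapping graph id -> original input string.
    Graphs whose id is missing from original_inputs are reported too.
    """
    return [g["id"] for g in produced
            if g.get("input") != original_inputs.get(g["id"])]
```

An empty result over a whole output file would indicate that every "input" field matches the original data.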