Whilst verifying results and evaluating models, I noticed that if you run the decoder over a file that already has its entities marked up (as used for testing), the output file ends up containing both sets of information: the original markup and the decoder's own annotations. It would be nice if the decoder could strip out any training-style markup when decoding, or otherwise flag the two sets as distinct. All of this was done with the basic text file format.
Other formats may have alternative ways to express both in a single output file.
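
As a stopgap, the markup can be stripped from the test file before it goes to the decoder. The following is only a rough sketch of that idea, not how the tool itself works: the `<ENTITY type="...">...</ENTITY>` tag syntax and the `split_gold` helper are assumptions made up for illustration, so the pattern would need adjusting to whatever the training format actually looks like.

```python
import re

# ASSUMPTION: entities are marked inline as <ENTITY type="PER">text</ENTITY>.
# This tag syntax is hypothetical; adapt the pattern to the real training markup.
ENTITY_RE = re.compile(r'<ENTITY type="([^"]*)">(.*?)</ENTITY>', re.DOTALL)


def split_gold(text):
    """Return (plain_text, gold_entities) for one marked-up string.

    plain_text has the assumed tags removed and can be fed to the decoder;
    gold_entities is a list of (type, surface_text) pairs kept separately
    so the two sets of annotations never end up mixed in one file.
    """
    gold = [(m.group(1), m.group(2)) for m in ENTITY_RE.finditer(text)]
    plain = ENTITY_RE.sub(lambda m: m.group(2), text)
    return plain, gold


if __name__ == "__main__":
    sample = ('<ENTITY type="PER">John Smith</ENTITY> visited '
              '<ENTITY type="LOC">Paris</ENTITY>.')
    plain, gold = split_gold(sample)
    print(plain)
    print(gold)
```

Keeping the gold entities in a separate list means the decoder output can later be compared against them, rather than both ending up interleaved in one output file.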