wellner / jcarafe

BSD 3-Clause "New" or "Revised" License

Producing decoded output from files with existing entity markup #8

Open antonyscerri opened 12 years ago

antonyscerri commented 12 years ago

Whilst verifying results and evaluating models, I noticed that if you run the decoder over a file that already has the entities marked up (as used for testing), the output file ends up containing both sets of annotations: the existing markup and the decoder's predictions. It would be nice if the decoder could strip out any training-style markup when decoding, or otherwise flag the two sets as distinct. I only tried this with the basic text files.

Other formats may have alternative ways to express both in a single output file.
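
As a stopgap, the existing markup could be stripped in a pre-processing step before the file is handed to the decoder. The following is only a rough sketch in Python, not part of jcarafe: it assumes the annotations are inline SGML-style tags such as `<PER>...</PER>`, and the function name `strip_inline_markup` and the tag names `PER` and `ORG` are hypothetical placeholders for whatever tagset the model was trained with.

```python
import re


def strip_inline_markup(text, tag_names):
    """Remove opening and closing tags for the given tag names, keeping the enclosed text.

    Assumption: annotations are inline SGML-style tags, e.g. <PER>John</PER>.
    Tags whose names are not listed in tag_names are left untouched.
    """
    if not tag_names:
        return text
    # Build an alternation of the escaped tag names, e.g. "PER|ORG".
    names = "|".join(re.escape(name) for name in tag_names)
    # Match <NAME ...> or </NAME>; \b stops PER from matching <PERSON>.
    pattern = re.compile(r"</?(?:%s)\b[^<>]*>" % names)
    return pattern.sub("", text)


if __name__ == "__main__":
    # Hypothetical example input; the tag names are illustrative only.
    marked_up = '<PER>John Smith</PER> works at <ORG type="x">Acme</ORG>.'
    print(strip_inline_markup(marked_up, ["PER", "ORG"]))
```

Running the stripped text through the decoder would then give output containing only the predicted entities, which can be compared against the original marked-up file separately.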