mainlp / xsid

Creative Commons Attribution Share Alike 4.0 International
Garbled BIO tags in training files #2

Open yvesscherrer opened 3 weeks ago

yvesscherrer commented 3 weeks ago

Hi! The en.train.conll file as well as the projectedTrain files contain varying numbers of three garbled tags: Orecurring_datetime, B-ecurring_datetime and I-ecurring_datetime. All versions seem to be affected, in both the fixed and unfixed files.
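
For anyone who wants to check their own copy, here is a minimal sketch of how such tags could be counted. It assumes that the slot tag is the last tab-separated column and that lines starting with `#` are comments; the function name `find_garbled_tags`, the sample rows, the intent value and the `known_labels` set are all hypothetical and only serve as illustration.

```python
import re
from collections import Counter

# Well-formed tags: "O", or "B-"/"I-" followed by a label name.
VALID_TAG = re.compile(r"^(O|[BI]-\S+)$")

def find_garbled_tags(lines, known_labels=None):
    """Count tags that are malformed or whose label is not in known_labels."""
    bad = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        tag = line.split("\t")[-1]  # assumption: tag is the last column
        if not VALID_TAG.match(tag):
            bad[tag] += 1
        elif known_labels is not None and tag != "O" and tag[2:] not in known_labels:
            bad[tag] += 1
    return bad

# Hypothetical sample rows (id, token, intent, tag), not copied from the dataset.
sample = [
    "# text: every monday at noon",
    "1\tevery\talarm/set_alarm\tOrecurring_datetime",
    "2\tmonday\talarm/set_alarm\tB-ecurring_datetime",
    "3\tat\talarm/set_alarm\tI-ecurring_datetime",
    "4\tnoon\talarm/set_alarm\tB-datetime",
    "",
]
known = {"datetime", "recurring_datetime"}
print(find_garbled_tags(sample, known_labels=known))
```

Without `known_labels`, only Orecurring_datetime would be caught, since B-ecurring_datetime and I-ecurring_datetime are syntactically valid BIO tags; passing the expected label set is what flags the truncated label name. On a real file, the lines could be read with `open(path, encoding="utf-8")`.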

robvanderg commented 3 weeks ago

Thanks for the notification. I had just heard about these as well, and I assume they were already present in the original training data. I have now fixed them in version 0.6.