masakhane-io / masakhane-ner

Poorly formatted line(s?) in masakhaner datasets #22

Closed neubig closed 1 year ago

neubig commented 1 year ago

Hello!

It seems that there is at least one poorly formatted line in the masakhaner-sna train split (the middle one below):

...
Hospital I-ORG
1487 Doctors'I-ORG
Association I-ORG
...

And one in the swa dataset:

. O
248 '
248 "

@dadelani seems like this should be fixed?
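
For anyone who wants to scan the other splits for similar lines, here is a minimal sketch. It assumes the files are CoNLL-style, with one whitespace-separated `token tag` pair per line and blank lines between sentences, and that tags are either `O` or start with `B-` or `I-`. The function names `is_valid_tag` and `find_malformed_lines` are hypothetical and not part of the repository.

```python
from typing import Iterable, List, Tuple


def is_valid_tag(tag: str) -> bool:
    """Assumed tag scheme: 'O', or 'B-'/'I-' followed by a label name."""
    return tag == "O" or (tag[:2] in ("B-", "I-") and len(tag) > 2)


def find_malformed_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that is not `token tag`."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        if not text.strip():
            continue  # blank lines separate sentences, so they are fine
        fields = text.split()
        # A well-formed line has exactly two fields and a recognizable tag.
        if len(fields) != 2 or not is_valid_tag(fields[1]):
            problems.append((number, text))
    return problems


if __name__ == "__main__":
    # Sample modeled on the excerpts above (hypothetical, not the real file).
    sample = [
        "Hospital I-ORG\n",
        "Doctors'I-ORG\n",
        "Association I-ORG\n",
        "\n",
        ". O\n",
        "'\n",
    ]
    for number, text in find_malformed_lines(sample):
        print(f"line {number}: {text!r}")
```

To check a real file, pass an open file handle instead of the sample list, e.g. `find_malformed_lines(open(path, encoding="utf-8"))`, where `path` points to whichever split you want to inspect.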

dadelani commented 1 year ago

Hello, thank you for pointing out the issues. They have been fixed.

neubig commented 1 year ago

Thanks a bunch!