Truncated results for XLM-R and mBERT - Githubissues

masakhane-io / masakhane-ner

Truncated results for XLM-R and mBERT #16

Closed neubig closed 3 years ago

neubig commented 3 years ago

Hi,

It seems that some of the prediction files cut long sentences short. For example, here is a long sentence in Hausa in the test file:

https://github.com/masakhane-io/masakhane-ner/blob/main/data/hau/test.txt#L6216

but this sentence is truncated in the XLM-R results:

https://github.com/masakhane-io/masakhane-ner/blob/main/entity_analysis/XLM-R/hau_xlmr_test_predictions.txt#L6216

Here's a similar result for mBERT:

Maybe you need to increase the maximum sequence length in whatever software you're using so that it can handle whole sentences?
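One rough way to find every affected sentence is to compare token counts between the gold file and the prediction file. The sketch below is only an illustration: it assumes both files use a CoNLL-style layout (one token per line, label in the last column, a blank line between sentences) and list sentences in the same order. The helper names `read_conll_sentences` and `find_truncated` are hypothetical and not part of the repository.

```python
from typing import List, Tuple


def read_conll_sentences(text: str) -> List[List[str]]:
    """Split CoNLL-style text into sentences, keeping only the token column.

    Assumption: one token per line, blank lines separate sentences.
    """
    sentences: List[List[str]] = []
    current: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
        else:
            # The first whitespace-separated field is taken as the token.
            current.append(line.split()[0])
    if current:
        sentences.append(current)
    return sentences


def find_truncated(gold_text: str, pred_text: str) -> List[Tuple[int, int, int]]:
    """Return (sentence_index, gold_length, predicted_length) for every
    sentence whose prediction has fewer tokens than the gold sentence.

    Assumption: both inputs list the same sentences in the same order.
    """
    gold = read_conll_sentences(gold_text)
    pred = read_conll_sentences(pred_text)
    return [
        (i, len(g), len(p))
        for i, (g, p) in enumerate(zip(gold, pred))
        if len(p) < len(g)
    ]
```

To use it, read the contents of `data/hau/test.txt` and the matching predictions file into strings and pass them to `find_truncated`. Any tuple it returns points at a sentence that lost tokens, and the largest gold length gives a hint of how high the maximum sequence length would need to be (in subword units it will be higher still).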

dadelani commented 3 years ago

Thank you for pointing this out. The truncation of predicted tags is now resolved.