[QUESTION] How to change the default Character map?

CAMeL-Lab / camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

MIT License


You can change the default CharMapper with the norm_map argument, but that will not fix this particular issue. Furthermore, norm_map specifies the normalization expected by the morphological database, so it shouldn't be changed for any of the databases we provide.
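To illustrate why the map has to match the database, here is a minimal sketch in plain Python. It does not use camel_tools at all: `TOY_NORM_MAP`, `TOY_DB`, and `lookup` are hypothetical stand-ins, and the alef normalization is only an assumed example, not a description of the actual default map.

```python
# Toy illustration only: none of these names come from camel_tools.

# A toy character map that folds alef variants into a bare alef.
TOY_NORM_MAP = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> alef
    "\u0625": "\u0627",  # alef with hamza below -> alef
    "\u0622": "\u0627",  # alef with madda above -> alef
})

# A toy "database" whose keys were built with TOY_NORM_MAP already applied.
TOY_DB = {"\u0627\u0643\u0644": ["analysis-1", "analysis-2"]}


def lookup(word, norm_map):
    """Normalize the word with norm_map, then look it up in the toy database."""
    return TOY_DB.get(word.translate(norm_map), [])


word = "\u0623\u0643\u0644"  # spelled with alef with hamza above

# Same normalization as the database keys: the entry is found.
matching = lookup(word, TOY_NORM_MAP)

# A different (here: empty) map no longer matches the keys: nothing is found.
mismatched = lookup(word, str.maketrans({}))

print(matching)
print(mismatched)
```

In this sketch, swapping the map only breaks the lookup. It does not change which of the returned analyses a disambiguator would rank first.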

This is most likely a limitation of the disambiguation model (i.e., the model has seen very few instances of the word in that particular context, if any). Can you tell us which disambiguation model you are using (MLE or BERT) and share the example sentence this appears in?

We are working on a new option that takes into account the spelling of a word in the input (particularly input diacritics), which should help with such cases.

We don't have an exact timeline for this, but we'll notify you here when it's done.


[QUESTION] How to change the default Character map? #143