Tikquuss / meta_XLM

Cross-lingual Language Model (XLM) pretraining and Model-Agnostic Meta-Learning (MAML) for fast adaptation of deep networks

No known abbreviations for language 'da', attempting fall-back to English version... #4

Closed sadanyh closed 2 years ago

sadanyh commented 2 years ago

Hi

Thank you very much for the tutorial. I am trying to train an XLM model with the (TM) objective on dialectal Arabic and Modern Standard Arabic, using da and ms as the language abbreviations. When I run data.sh to preprocess the data, I get this error:

No known abbreviations for language 'da', attempting fall-back to English version...

Is there a parameter that I need to change? I changed the language names to da and ms in "apply_bpe_preprocess.sh", but I still get the error. Thank you for your help.

sadanyh commented 2 years ago

I managed to solve this problem by choosing different abbreviations for my languages, ones that are already supported.

Thanks

Tikquuss commented 2 years ago

This is not an error as such, just an alert. Take a look here.

Moses has no specialized tokenizer for your language code, so the English one is used by default.
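
To illustrate the fall-back behaviour described above, here is a minimal Python sketch. It is not Moses's actual code: the function name `resolve_tokenizer_lang` and the `KNOWN_LANGS` set are hypothetical, and the set is a made-up placeholder rather than the real list of languages Moses supports. It only shows the general pattern of warning and then continuing with English rules.

```python
import sys

# Hypothetical placeholder: codes assumed to have dedicated tokenizer rules.
# This is NOT the real list shipped with Moses.
KNOWN_LANGS = {"en", "fr", "de"}


def resolve_tokenizer_lang(lang, known=KNOWN_LANGS, fallback="en"):
    """Return the language code whose tokenizer rules will actually be used."""
    if lang in known:
        return lang
    # Unknown code: emit a warning on stderr and keep going with the fallback,
    # which is why the message is an alert and not a fatal error.
    print(
        f"No known abbreviations for language '{lang}', "
        "attempting fall-back to English version...",
        file=sys.stderr,
    )
    return fallback


if __name__ == "__main__":
    for code in ("fr", "da", "ms"):
        print(code, "->", resolve_tokenizer_lang(code))
```

Under these assumptions, a code such as da or ms that is missing from the known set resolves to en after printing the warning, and processing continues.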