facebookresearch / MUSE

A library for Multilingual Unsupervised or Supervised word Embeddings

Delimiter for bilingual dictionaries is different for European language pairs, and English language pairs #142

Open map222 opened 4 years ago

map222 commented 4 years ago

Small thing I discovered: in the European bilingual dictionaries, the two words on each line are separated by a space, e.g. in en-es.txt:

the el

However, in the bilingual dictionaries that pair English with any other language, the two words are separated by a tab. E.g. in en-vi.txt, the words are separated by tabs (can't get tab formatting in Markdown).

This caused me a minor problem parsing the dictionaries, as I was not expecting different delimiters.
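
In case it helps anyone else, here is a minimal sketch of a workaround, assuming every entry is exactly two whitespace-free tokens on one line. Calling `str.split()` with no arguments splits on any run of whitespace, so the same code handles both the space-delimited and the tab-delimited files. The name `parse_dictionary_lines` is my own, not something from MUSE.

```python
from typing import Iterable, List, Tuple


def parse_dictionary_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse 'source target' lines separated by spaces or tabs.

    Hypothetical helper, not part of MUSE. Assumes exactly two
    whitespace-free tokens per non-blank line.
    """
    pairs = []
    for line_number, line in enumerate(lines, start=1):
        # split() with no argument splits on any whitespace run,
        # so it does not matter whether the delimiter is ' ' or '\t'.
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 tokens, got {len(parts)}"
            )
        pairs.append((parts[0], parts[1]))
    return pairs
```

Usage would look something like `with open("en-vi.txt", encoding="utf-8") as f: pairs = parse_dictionary_lines(f)`. Note this would not work if a dictionary ever contained multi-word entries, since those would produce more than two tokens.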