nmrksic / counter-fitting

Counter-fitting Word Vectors to Linguistic Constraints
Apache License 2.0

Typos in the word embeddings? #10

Open sharon-gao opened 4 years ago

sharon-gao commented 4 years ago

Hi,

Thanks for your dedication to this work!

I think there are some typos in the generated embedding file. For example, 'recieve' and 'recieved' appear alongside 'receive' and 'received'. There also seem to be words such as 'worden' and 'viens', which look like vocabulary taken from languages other than English. This could be a problem when I want to generate synonyms to replace words in the original sentences.

Do you have any ideas on how to avoid these words?
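
One workaround I can think of is to filter the vocabulary against a trusted English word list before picking synonyms. Below is a minimal sketch of that idea, not something taken from this repository. It assumes the embedding file has one entry per line in the form `word v1 v2 ... vn`. The names `parse_vectors` and `filter_vocabulary`, the toy lines, and the tiny `english_words` set are all hypothetical stand-ins for the real file and a real dictionary.

```python
from typing import Dict, Iterable, List, Set


def parse_vectors(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse lines of the assumed form 'word v1 v2 ... vn' into a dict."""
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def filter_vocabulary(
    vectors: Dict[str, List[float]], allowed: Set[str]
) -> Dict[str, List[float]]:
    """Keep only words found in the trusted word list (case-insensitive)."""
    return {w: v for w, v in vectors.items() if w.lower() in allowed}


# Toy data standing in for the real embedding file and a real English word list.
sample_lines = [
    "receive 0.1 0.2",
    "recieve 0.1 0.3",
    "worden 0.5 0.4",
    "received 0.2 0.2",
]
english_words = {"receive", "received"}

vectors = parse_vectors(sample_lines)
clean = filter_vocabulary(vectors, english_words)
print(sorted(clean))  # ['receive', 'received']
```

This drops both the misspellings and the non-English entries in one pass, but the result is only as good as the word list, and it would also drop valid words that the list happens to miss.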