lvaudor / mixr

Missing words in the English lexicon #1

Closed lbajemon closed 7 months ago

lbajemon commented 7 months ago

Hi,

While using the mixr package to get an English lexicon, I noticed that some words are missing from your database. These are mainly adverbs, nouns related to water (the theme of my corpus), and British English spellings (for example, "urbanisation" is not recognized but "urbanization" is). This is the command I used to get the lexicon: lexicon_en = mixr::get_lexicon("en")

I made a list of the missing words (approx. 1000) and filled in the corresponding lemma for each, checking the Oxford English Dictionary when in doubt. However, I did not fill in their type, as it is quite time-consuming and I will not use this variable. Could you possibly add these words to the lexicon? missing_words_lexicon_en.csv

Thank you!
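
For anyone who wants to run a similar check on their own corpus, here is a rough sketch of how such a list of missing words might be built. It is only an illustration under assumptions: the lexicon is replaced by a toy data frame, the column names (word, lemma, type) are guesses based on the variables mentioned above, and corpus_words is a made-up vector of tokens. In practice the lexicon would come from mixr::get_lexicon("en").

```r
# Toy stand-in for the lexicon (assumption). In practice:
#   lexicon_en <- mixr::get_lexicon("en")
# The column names word/lemma/type are assumed for illustration.
lexicon_en <- data.frame(
  word  = c("river", "rivers", "urbanization"),
  lemma = c("river", "river", "urbanization"),
  type  = c("noun", "noun", "noun"),
  stringsAsFactors = FALSE
)

# Hypothetical tokens extracted from a corpus
corpus_words <- c("river", "urbanisation", "floodplain", "rivers", "Urbanisation")

# Lower-case, de-duplicate, and keep only words absent from the lexicon
missing_words <- setdiff(unique(tolower(corpus_words)), lexicon_en$word)

# Table to complete by hand: lemma and type are left empty for now
missing_tbl <- data.frame(
  word  = missing_words,
  lemma = NA_character_,
  type  = NA_character_,
  stringsAsFactors = FALSE
)

# Write the list to a temporary CSV file that can then be edited and shared
out_file <- file.path(tempdir(), "missing_words_lexicon_en.csv")
write.csv(missing_tbl, out_file, row.names = FALSE)
```

With this toy data, "urbanisation" and "floodplain" end up in the list, while "river" and "rivers" do not, since they are already in the lexicon.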

lvaudor commented 7 months ago

Hi, it's done: the words have been added to the English lexicon with type="unspecified" (we can document the type when we have some time, or possibly never, just keeping in mind that these unspecified terms are of interest to us ;-) ).

lvaudor commented 7 months ago

1001 more lines in English lexicon with type="unspecified"

lbajemon commented 6 months ago

Hi,

I updated my corpus and found new words missing from the dictionary. Here is my list: missing_words_lexicon_en2.csv

Thank you :)