first20hours / google-10000-english

This repo contains a list of the 10,000 most common English words in order of frequency, as determined by n-gram frequency analysis of Google's Trillion Word Corpus.

Not In Order by Frequency in English Language #13

Open totallyuneekname opened 7 years ago

totallyuneekname commented 7 years ago

I find it very hard to believe that "ebay" is the 217th most common word in the English language, and Google's ngram viewer agrees with me (the words in the linked ngram search all appear AFTER "ebay" in "google-10000-english.txt", yet the viewer shows them as more frequent).

Further analysis of the data, using the Google ngram viewer itself, indicates that the order of this list in no way represents the actual relative frequencies at which these words are used.
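
For anyone who wants to repeat this kind of check, here is a minimal sketch of one way to do it: look up a word's position in the list, then count how many word pairs are ordered differently in the list than in some reference set of frequencies. The helper names `rank_of` and `discordant_fraction` are hypothetical, and the toy counts below are invented purely for illustration; they are not real ngram data.

```python
from itertools import combinations


def rank_of(word, words):
    """Return the 1-based position of `word` in `words`, or None if absent."""
    try:
        return words.index(word) + 1
    except ValueError:
        return None


def discordant_fraction(words, reference_freq):
    """Fraction of word pairs whose list order contradicts `reference_freq`.

    `words` is assumed to be ordered from most to least frequent.
    Words missing from `reference_freq` are skipped, as are tied pairs.
    """
    known = [w for w in words if w in reference_freq]
    discordant = 0
    total = 0
    # combinations() keeps input order, so `earlier` precedes `later` in the list.
    for earlier, later in combinations(known, 2):
        if reference_freq[earlier] == reference_freq[later]:
            continue
        total += 1
        if reference_freq[earlier] < reference_freq[later]:
            discordant += 1
    return discordant / total if total else 0.0


# Toy data, invented for illustration only (NOT real corpus counts).
toy_list = ["the", "ebay", "house", "water"]
toy_freq = {"the": 1000, "ebay": 5, "house": 80, "water": 90}

print(rank_of("ebay", toy_list))                # 2
print(discordant_fraction(toy_list, toy_freq))  # 0.5
```

To run this against the real list, `words` could be loaded with something like `open("google-10000-english.txt").read().split()`, assuming the file holds one word per line; the reference frequencies would have to come from whatever source is being compared against. A fraction near 0 would mean the two orderings mostly agree, while a value near 0.5 would mean the list order is close to unrelated to the reference.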

Of course this list is still useful to many who are looking for a list of common words, but I take issue with the claim that these are the "most common English words in order of frequency."