Closed trans closed 12 years ago
You'll have to ask Peter Norvig (http://norvig.com/) about that: it's his data. He's the director of research at Google and a careful and trustworthy guy, so I trust the data. Here's the original source if you're interested: http://norvig.com/ngrams/count_1w.txt. It's linked from this page: http://norvig.com/ngrams/
Re: letters and state abbreviations - you're more than welcome to take them out if you like... it's not that hard. I'm using this list for typing training, so I left them in.
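If it helps, here's a rough sketch of how you could strip them out. It assumes each line of the list is a word and a count separated by whitespace, which is how I understand count_1w.txt to be laid out. The names `STATE_ABBREVIATIONS`, `KEEP_SINGLE`, `keep_word` and `filter_counts` are made up for this example, and the abbreviation set is deliberately partial, so extend it to whatever you want to drop.

```python
# Hypothetical, partial set of two-letter codes to drop; extend as needed.
STATE_ABBREVIATIONS = {"sd", "nd", "ca", "ny", "tx"}

# Assumption: "a" and "i" are real words worth keeping; other single letters go.
KEEP_SINGLE = {"a", "i"}


def keep_word(word, excluded=STATE_ABBREVIATIONS):
    """Return True unless the word is an unwanted single letter or is excluded."""
    word = word.lower()
    if len(word) == 1 and word not in KEEP_SINGLE:
        return False
    return word not in excluded


def filter_counts(lines, excluded=STATE_ABBREVIATIONS):
    """Yield (word, count) pairs from 'word count' lines, skipping unwanted words."""
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        word, count = parts
        if keep_word(word, excluded):
            yield word, int(count)
```

To run it on the real file, open your local copy of count_1w.txt and pass the file object to `filter_counts`, since it only needs an iterable of lines. Note that a blanket two-letter filter would also throw away real words like "of" and "to", which is why this uses an explicit set instead.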
I have a hard time believing "information" is more frequent than "when".
Also, there are numerous entries for single letters like "x" and state abbreviations like "sd", which IMO are not useful entries.