rhiever / name-age-calculator

Analyzes a name and guesses the age range of a person with that name.
http://rhiever.github.io/name-age-calculator/

How did you get the statistics of all names? #2

Open olivercqc opened 6 years ago

olivercqc commented 6 years ago

Hello, name-age-calculator is great work and I really appreciate it! I have read your blog post about it and I want to build a similar project. I'm just wondering how to calculate the "Median", "25th_Percentile", and "75th_Percentile" of a name. What formula can I use, or where can I download this data? Can you tell me how you came up with the idea of using this formula? Looking forward to your reply!

rhiever commented 6 years ago

See this post and the FiveThirtyEight article linked in it. Hopefully that helps.

olivercqc commented 6 years ago

Thanks for replying. I've read the post and the FiveThirtyEight article linked in it, but I still can't figure out how to calculate the statistics for all names; maybe my understanding is just poor. Can you tell me how you got the data in "name-stats.txt"? How did you calculate it? Can you give me a formula? Looking forward to your reply!

rhiever commented 6 years ago

For every name in the baby name database from the U.S. Social Security Administration, I have a count of the number of babies given that name every year. Thus I'm able to plot out the distribution of babies given a particular name across birth years. From that distribution, I can calculate the median, 25th percentile, and 75th percentile, which I use to guess the age range.

The above analysis is repeated for every name & gender pair, and the results of that are stored in this GitHub repo.
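
For anyone looking for a concrete formula, the calculation described above can be sketched roughly as follows. This is only an illustration, not the code used to generate "name-stats.txt": the function names `weighted_percentile` and `age_stats`, the example counts, and the reference year are all made up, and the sketch assumes you have already loaded the yearly counts for one name and gender into a dict. It also ignores whether the people born in a given year are still alive.

```python
def weighted_percentile(counts_by_value, q):
    """Return the smallest value at which the cumulative share of the
    total count reaches the fraction q (0 < q <= 1)."""
    total = sum(counts_by_value.values())
    if total == 0:
        raise ValueError("no births recorded for this name")
    running = 0
    for value in sorted(counts_by_value):
        running += counts_by_value[value]
        if running >= q * total:
            return value
    return max(counts_by_value)


def age_stats(counts_by_birth_year, current_year):
    """Return (25th percentile, median, 75th percentile) of age for one
    name/gender pair, given {birth_year: number_of_babies}."""
    # Convert birth years to ages so the percentiles are ages directly.
    counts_by_age = {
        current_year - year: count
        for year, count in counts_by_birth_year.items()
    }
    return (
        weighted_percentile(counts_by_age, 0.25),
        weighted_percentile(counts_by_age, 0.50),
        weighted_percentile(counts_by_age, 0.75),
    )


# Hypothetical counts for a single name and gender (not real SSA data).
example_counts = {1990: 100, 1991: 200, 1992: 400, 1993: 200, 1994: 100}
print(age_stats(example_counts, 2020))  # (27, 28, 29)
```

Each year's count acts as a weight: sort the ages, accumulate the counts, and report the age at which the running total first reaches 25%, 50%, and 75% of all babies given that name. Running `age_stats` once per name and gender pair would give one row of statistics per pair.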