rhiever / name-age-calculator

Analyzes a name and guesses the age range of a person with that name.
http://rhiever.github.io/name-age-calculator/

How did you clean up last names in the SSA dataset? #3

Open andrewgodman opened 4 years ago

andrewgodman commented 4 years ago

I've found that there are last names in the data with low occurrence counts, for example Goodman.

In the raw data I found a few of these dating back to the 30s.

PS: if you know of a good dataset of last names, I would be keen to hear about it, as I've been working on name detection.

rhiever commented 4 years ago

Hi @andrewgodman,

This dataset is purely of first names. I'm not aware of a dataset for last names.

andrewgodman commented 4 years ago

@rhiever Thanks. Oddly enough, I have found some last names in the SSA dataset. One example is the file yob1919.txt, which contains the line Goodman,M,5, but I cannot find this name in your dataset. This is a good thing :) I've also been having issues with the gender-by-name data, which uses the SSA data as well: https://data.world/howarder/gender-by-name
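For anyone who wants to reproduce the check, here is a minimal sketch of how one might look up a name in one of the raw files. It assumes each line has the form name,sex,count, as in the Goodman,M,5 line quoted above. The function name `find_name` is hypothetical, and every sample row other than the Goodman one is a made-up placeholder, not real SSA data.

```python
import csv
import io


def find_name(lines, name):
    """Return (name, sex, count) tuples whose name matches, ignoring case.

    `lines` is any iterable of text lines in the assumed
    name,sex,count format, e.g. an open yobYYYY.txt file.
    """
    matches = []
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        row_name, sex, count = row
        if row_name.lower() == name.lower():
            matches.append((row_name, sex, int(count)))
    return matches


# In-memory stand-in for open("yob1919.txt").
# Only the Goodman line comes from this thread; the other rows
# are invented placeholders for illustration.
sample = io.StringIO(
    "Placeholdera,F,10\n"
    "Goodman,M,5\n"
    "Placeholderb,M,20\n"
)

hits = find_name(sample, "goodman")
print(hits)
```

To run it against a real file, pass an open file handle instead of `sample`, for example `with open("yob1919.txt", newline="") as f: find_name(f, "Goodman")`.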