theonaunheim / surgeo

Open Source Proxy Demographic module written in Python
MIT License
32 stars 16 forks

Implement BIFSG #12

Closed by TheCleric 3 years ago

TheCleric commented 3 years ago

Would there be any interest in attempting to implement the improved BIFSG model that includes first name data as well?

See https://www.tandfonline.com/doi/full/10.1080/2330443X.2018.1427012

While the overall magnitude of the improvement associated with BIFSG is somewhat modest, the largest improvements occur for NH Blacks, which is the group for which BISG is least accurate. Moreover, the improvement for NH Blacks is much higher where geography has low ability to distinguish NH Blacks. This aspect is particularly important as much of the research on the topic of racial/ethnic differences focuses on specific geographic areas rather than the entire United States. It is also worthwhile to note that the improvements of BIFSG over BISG are generally comparable to the improvements of BISG over simpler methods. Last but not least, when assessing the degree of improvement from BIFSG, one should consider that even the most advanced methods are likely to result in incremental improvements for Hispanics and NH Asians, given that surnames alone are highly predictive for these particular groups.

I wouldn't mind submitting a PR on this if there is interest.
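For readers unfamiliar with the method, here is a minimal sketch of the BIFSG update as I understand it from the linked paper: the surname-based prior P(race | surname) is multiplied by P(first name | race) and P(geography | race), then normalized over races. The function name, the dictionary layout, and all numbers below are assumptions made up for illustration; none of it is surgeo's API or real census data.

```python
def bifsg_posterior(p_race_given_surname, p_first_given_race, p_geo_given_race):
    """Naive Bayes combination of surname, first name, and geography.

    Each argument maps a race label to a probability (hypothetical layout).
    Returns P(race | surname, first name, geography), normalized to sum to 1.
    """
    # Unnormalized joint term for each race: P(r|s) * P(f|r) * P(g|r).
    unnormalized = {
        race: p_race_given_surname[race]
        * p_first_given_race[race]
        * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all joint probabilities are zero; cannot normalize")
    # Dividing by the total turns the joint terms into a proper distribution.
    return {race: value / total for race, value in unnormalized.items()}


# Toy inputs, invented for illustration only (not taken from any dataset).
p_race_given_surname = {"white": 0.6, "black": 0.3, "hispanic": 0.1}
p_first_given_race = {"white": 0.002, "black": 0.004, "hispanic": 0.001}
p_geo_given_race = {"white": 0.001, "black": 0.003, "hispanic": 0.002}

posterior = bifsg_posterior(
    p_race_given_surname, p_first_given_race, p_geo_given_race
)
```

Dropping the first-name factor from the product gives the BISG form, so in this sketch the first name acts as one extra likelihood term rather than a separate model.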

theonaunheim commented 3 years ago

Thanks for your interest, @TheCleric.

Tentative yes. Presuming there's a reliable, public data source, it shouldn't be difficult to add a new model class. I will take a look this weekend and provide an update.

TheCleric commented 3 years ago

Since I needed it for my own project, I went ahead and started adding it. I should have a PR for you later. The paper I linked to uses this data source:

https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/TYJKEZ

I have already created a .ipynb notebook, like the ones you have, for downloading and transforming it.
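As a rough illustration of what such a transformation step might involve (this is a sketch, not the notebook's actual contents): if the first-name table provides, for each name, an observation count and the share of each race among people with that name, those shares can be inverted into P(first name | race) by weighting with the counts and normalizing within each race. The `obs` key, the race labels, and the toy rows are assumptions for this example and may not match the dataset's real column names.

```python
def first_name_given_race(rows, races):
    """Invert per-name race shares into P(first name | race).

    rows maps a first name to a dict holding "obs" (how many people were
    observed with that name) and one share in [0, 1] per race label.
    This layout is hypothetical.
    """
    # Estimated number of people of each race who carry each first name.
    counts = {
        name: {race: row["obs"] * row[race] for race in races}
        for name, row in rows.items()
    }
    # Total estimated people of each race across all first names.
    totals = {race: sum(c[race] for c in counts.values()) for race in races}
    # Normalize within each race so the names sum to 1 for that race.
    return {
        name: {
            race: (c[race] / totals[race] if totals[race] else 0.0)
            for race in races
        }
        for name, c in counts.items()
    }


# Toy rows, invented for illustration only.
rows = {
    "ALEX": {"obs": 100, "white": 0.8, "black": 0.2},
    "SAM": {"obs": 300, "white": 0.4, "black": 0.6},
}
p_first = first_name_given_race(rows, ["white", "black"])
```

With these toy rows, each race's probabilities sum to 1 across the two names, which is the form the first-name factor needs when it is multiplied into the surname and geography terms.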

theonaunheim commented 3 years ago

@TheCleric, many thanks for your help; I greatly appreciate it.

I merged your PR and made some minor edits. Please let me know if you would like to see any changes.