leo-desbureaux-tellae closed 4 months ago
Hi! Thanks a lot, I'm at hEART this week and a bit busy the week after, but I'll look at it asap!
Thanks a lot, LGTM
Issues from your first review are fixed! Let me know if anything else needs changes.
Resolved conflict in changelog
Thanks, looks all good, you can merge!
Perfect, thank you very much @leo-desbureaux-tellae, thanks @sebhoerl for the review !! ⛵ 🚀
Introduction of the Bhepop2 package for income assignment
data.income.municipality
The distributions DataFrame returned by data.income.municipality is now tagged by attribute and modality. An attribute describes a property present on the population agents. A modality is a value taken by this attribute. In Eqasim, we use two attributes:
New columns of the returned DataFrame are ["commune_id", "q1", "q2", "q3", "q4", "q5", "q6", "q7", "q8", "q9", "attribute", "modality", "is_imputed", "is_missing", "reference_median"]. Global distributions (those that were returned in the previous version of municipality.py) are tagged with attribute and modality "all".
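To illustrate the tagging, here is a minimal sketch of how the global and per-attribute rows could be selected from such a DataFrame. The rows and values below are made up for the example, and the "reference_median" value is only a placeholder since its exact meaning is not detailed here. Only the column names and the "all" tag come from the description above.

```python
import pandas as pd

decile_columns = ["q%d" % i for i in range(1, 10)]

# Hypothetical rows: (commune_id, attribute, modality, nine made-up deciles)
rows = [
    ("01001", "all", "all", [10000 + 2000 * i for i in range(9)]),
    ("01001", "family_comp", "Single_parent", [8000 + 1500 * i for i in range(9)]),
    ("01002", "all", "all", [11000 + 2100 * i for i in range(9)]),
]

df = pd.DataFrame([
    {
        "commune_id": commune_id,
        **dict(zip(decile_columns, deciles)),
        "attribute": attribute,
        "modality": modality,
        "is_imputed": False,
        "is_missing": False,
        "reference_median": deciles[4],  # placeholder value for the sketch
    }
    for commune_id, attribute, modality, deciles in rows
])

# Global distributions, i.e. the rows tagged "all" / "all"
df_global = df[(df["attribute"] == "all") & (df["modality"] == "all")]

# Distribution for a single attribute and modality
df_single_parent = df[
    (df["attribute"] == "family_comp") & (df["modality"] == "Single_parent")
]
```

Code that only needs the previous behaviour can keep working on the "all" / "all" subset.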
synthesis.population.income
New config option "income_assignation_method" (should it be "income_assignment_method"?). This option selects the method used to assign an income to population agents. The former method is selected with the value "uniform" (what should be the default?). A new assignment method has been added, selected with the value "bhepop2". This method uses the Bhepop2 package to match per-attribute distributions instead of matching only the global one.
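As a rough sketch of how the option could be validated before dispatching to a method: the function name resolve_income_method and the use of a plain dict for the configuration are assumptions made for illustration, not the pipeline's actual code. No default is applied here, since the default is still an open question.

```python
# Values named in the description above
KNOWN_METHODS = ("uniform", "bhepop2")


def resolve_income_method(config):
    """Return the configured method name, or raise if missing or unknown.

    `config` is a plain dict in this sketch; how the pipeline actually
    exposes its configuration is not shown here.
    """
    if "income_assignation_method" not in config:
        raise KeyError("income_assignation_method is not set")

    method = config["income_assignation_method"]

    if method not in KNOWN_METHODS:
        raise ValueError("Unknown income assignation method: %r" % (method,))

    return method
```

For example, resolve_income_method({"income_assignation_method": "bhepop2"}) returns "bhepop2", while an unrecognised value raises a ValueError instead of silently falling back to another method.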
analysis.methods.income.compare_methods
A new analysis module has been added to compare the assignment methods. It can be run using the path analysis.methods.income.compare_methods. This module generates plots comparing the income distribution of each assignment method with the source distribution (here, Filosofi data). The comparison is done per attribute. For instance, we compare the income distribution of individuals with attribute "family_comp" equal to "Single_parent" for the two methods, and see which method best matches the source distribution.
Another output of this module is a table measuring the distance of each method to the source distribution, again per attribute. This allows a more quantitative comparison between assignment methods.
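The distance used by the module is not specified in this description. As one possible illustration, the sketch below uses the mean absolute relative difference between deciles, which is an assumption and not necessarily what compare_methods computes. The deciles and the labels "method_a" and "method_b" are made up and do not represent results from this PR.

```python
def decile_distance(method_deciles, source_deciles):
    """Mean absolute relative difference between two lists of deciles."""
    if len(method_deciles) != len(source_deciles):
        raise ValueError("Both distributions need the same number of deciles")

    return sum(
        abs(m - s) / s for m, s in zip(method_deciles, source_deciles)
    ) / len(source_deciles)


# Made-up source deciles for one (attribute, modality) pair
key = ("family_comp", "Single_parent")
source = {key: [8000 + 1500 * i for i in range(9)]}

# Made-up deciles produced by two hypothetical assignment methods
methods = {
    "method_a": {key: [v * 1.2 for v in source[key]]},
    "method_b": {key: [v * 1.05 for v in source[key]]},
}

# One row per (attribute, modality), one distance per method
table = {
    k: {name: decile_distance(deciles[k], source[k]) for name, deciles in methods.items()}
    for k in source
}
```

A smaller value in a row means the method's deciles are closer to the source deciles for that attribute and modality.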