eqasim-org / ile-de-france

An open synthetic population of Île-de-France for agent-based transport simulation
GNU General Public License v2.0

feat: use bhepop2 package for income assignment #243

Closed leo-desbureaux-tellae closed 2 months ago

leo-desbureaux-tellae commented 3 months ago

Introduction of the Bhepop2 package for income assignment

data.income.municipality

The distributions DataFrame returned by data.income.municipality is now tagged by attribute and modality. An attribute describes a property of the population agents. A modality is a value taken by that attribute. In Eqasim, we use two attributes:

The columns of the returned DataFrame are now ["commune_id", "q1", "q2", "q3", "q4", "q5", "q6", "q7", "q8", "q9", "attribute", "modality", "is_imputed", "is_missing", "reference_median"]. Global distributions (those returned by the previous version of municipality.py) are tagged with attribute and modality "all".
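
As a rough illustration of the tagged layout, the sketch below builds a toy DataFrame with the columns listed above and separates the global distributions from the per-attribute ones. The commune id, every numeric value, and the variable names are invented for the example; only the column names, the "all" tag, and the "family_comp" / "Single_parent" pair come from this PR.

```python
import pandas as pd

# Column layout as listed in this PR description.
COLUMNS = (
    ["commune_id"]
    + [f"q{i}" for i in range(1, 10)]
    + ["attribute", "modality", "is_imputed", "is_missing", "reference_median"]
)

# Toy rows: the commune id and all numbers are invented for illustration,
# including reference_median, whose exact meaning is not described here.
deciles_all = [10_000 + 3_000 * i for i in range(9)]
deciles_single = [8_000 + 2_000 * i for i in range(9)]
rows = [
    ["c1", *deciles_all, "all", "all", False, False, 22_000],
    ["c1", *deciles_single, "family_comp", "Single_parent", False, False, 22_000],
]
df = pd.DataFrame(rows, columns=COLUMNS)

# Global distributions carry "all" in both tag columns.
is_global = (df["attribute"] == "all") & (df["modality"] == "all")
df_global = df[is_global]
df_per_attribute = df[~is_global]
```

Filtering on both tag columns should recover the same rows that the previous version of the stage returned, assuming nothing else about them changed.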

synthesis.population.income

New config option "income_assignation_method" (should it be "income_assignment_method"?). This option selects the method used to assign an income to population agents. The former method is selected with the value "uniform" (what should the default be?). A new assignment method has been added, selected with the value "bhepop2". It uses the Bhepop2 package to match the per-attribute distributions instead of matching only the global one.
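
To make the option concrete, here is a minimal sketch of how a stage could read and validate it. The function name `resolve_income_method` and the plain-dict config are hypothetical and are not the pipeline's actual API. Since the default is still an open question above, the sketch requires an explicit value rather than guessing one.

```python
# Values named in this PR; anything else is rejected.
KNOWN_METHODS = ("uniform", "bhepop2")


def resolve_income_method(config: dict) -> str:
    """Return the configured method name, or raise if it is missing or unknown.

    Hypothetical helper: the real stage may read its configuration differently.
    """
    method = config.get("income_assignation_method")
    if method not in KNOWN_METHODS:
        raise ValueError(
            f"income_assignation_method must be one of {KNOWN_METHODS}, got {method!r}"
        )
    return method


chosen = resolve_income_method({"income_assignation_method": "bhepop2"})
```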

analysis.methods.income.compare_methods

A new analysis module has been added to compare the assignment methods. It can be run using the path analysis.methods.income.compare_methods. This module generates plots comparing the income distribution produced by each assignment method with the source distribution (here, Filosofi data). The comparison is done per attribute. For instance, we compare the income distribution of individuals with attribute "family_comp" equal to "Single_parent" for the two methods, and see which method best matches the source distribution.

[Figure: family_comp-Single_parent]

Another output of this module is a table measuring the distance of each method to the source distribution, again per attribute. This allows a quantitative comparison between assignment methods.
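
The distance used by the module is not spelled out in this description. As one possible illustration, the sketch below scores a set of assigned incomes against nine reference deciles using the mean absolute gap between deciles. The metric, the function name `decile_distance`, and the sample data are all assumptions made for the example.

```python
from statistics import quantiles


def decile_distance(incomes, reference_deciles):
    """Mean absolute gap between empirical deciles and reference deciles.

    Assumed metric for illustration; the analysis module may use another one.
    """
    if len(reference_deciles) != 9:
        raise ValueError("expected nine reference deciles (q1..q9)")
    simulated = quantiles(incomes, n=10)  # nine cut points, q1..q9
    gaps = [abs(s - r) for s, r in zip(simulated, reference_deciles)]
    return sum(gaps) / len(gaps)


# Invented sample: incomes 1..99 have deciles 10, 20, ..., 90.
sample_incomes = list(range(1, 100))
exact_reference = [10 * i for i in range(1, 10)]
shifted_reference = [r + 5 for r in exact_reference]

distance_exact = decile_distance(sample_incomes, exact_reference)
distance_shifted = decile_distance(sample_incomes, shifted_reference)
```

Computing this once per method and per (attribute, modality) pair would give a table of the kind described above, with smaller values meaning a closer match to the source deciles.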

sebhoerl commented 3 months ago

Hi! Thanks a lot, I'm at hEART this week and a bit busy the week after, but I'll look at it asap!

sebhoerl commented 3 months ago

Thanks a lot, LGTM

leo-desbureaux-tellae commented 2 months ago

Issues from your first review are fixed! Let me know if anything else needs to change.

leo-desbureaux-tellae commented 2 months ago

Resolved conflict in changelog

sebhoerl commented 2 months ago

Thanks, looks all good, you can merge!

Nitnelav commented 2 months ago

Perfect, thank you very much @leo-desbureaux-tellae, thanks @sebhoerl for the review!! ⛵ 🚀