best-cost / best-cost_WPs

This repository aims to collect the confidential (not public) work of the EU project BEST-COST in the framework of the workpackages (WPs).

Social inequalities #307

Closed: luytax closed 5 days ago

luytax commented 1 month ago

How to include the social dimension in R package results:

luytax commented 1 month ago

Brecht just sent an email (8.10.2024) detailing the Decile/Quintile Share Ratio approach, and also a way to compare the PAF of the total population to the PAF of the most deprived population (see quote & screenshot below).

In our current work, we tend to use three indicators:

Absolute difference: AB_highestdeprivation - AB_lowestdeprivation

Relative difference: AB_highestdeprivation / AB_lowestdeprivation

PAF: (AB_totalpopulation - AB_lowestdeprivation) / AB_totalpopulation
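As a minimal sketch, the three indicators quoted above could be computed in R as follows. The function name `inequality_indicators()` and the input values are hypothetical and chosen only for illustration; the sketch assumes that the three AB values are expressed in comparable units (for example, as rates).

```r
# Hypothetical helper (not part of the R package): computes the three
# indicators from the AB of the most deprived group, the least deprived
# group and the total population.
inequality_indicators <- function(ab_highest, ab_lowest, ab_total) {
  list(
    absolute_difference = ab_highest - ab_lowest,
    relative_difference = ab_highest / ab_lowest,
    paf = (ab_total - ab_lowest) / ab_total
  )
}

# Illustrative numbers only, not project results
result <- inequality_indicators(
  ab_highest = 120,
  ab_lowest = 80,
  ab_total = 100
)
print(result)
```

With these made-up inputs, the absolute difference is 40, the relative difference is 1.5 and the PAF is 0.2.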

image

ungatoverde commented 1 month ago

Answer of Carl

Dear Brecht,

Thank you for sharing this reference. It is very relevant and aligns with some of the thoughts I’ve had. I, too, have been somewhat sceptical of using the SII and RII, although I have applied them on occasion. These metrics are helpful when considering the full spectrum of inequality and when needing a single indicator, especially since they provide confidence intervals. However, as you demonstrate in your paper, their interpretation can be unclear, and they may bias findings in certain contexts (this was especially interesting).

Simple differences and ratios, as you note, are straightforward to interpret and avoid some of these biases, though they only compare the extreme categories. Depending on the policy goals, this could either be advantageous or less useful, as it ignores the middle categories.

Using the PAF, as you suggest, alongside pairwise inequality indicators seems like a promising approach. It avoids the inherent issues of regression-based measures like the SII and RII, while still providing meaningful insights. I will incorporate the methods from the paper into the initial assessments of the BEST-COST MDI and health outcomes.

Thanks again for this, and I look forward to continuing the discussion.

Kind regards,

Carl

ungatoverde commented 1 week ago

@arnopauwels thanks a lot for providing the quantitative example of your suggested method. I just moved it here: r_package\testing\input\inequalities. I also split the R code that you provided in inequalities_example_belgium.R into Preparation of input data.Rmd and testing_Rpackage.Rmd, as we did with other examples.

I have a question about the data set that you provided. Inside the table BIMD_2011_WITHOUT_HEALTH_ELLIS_WIDE.csv you have the following columns: CD_RES_SECTOR, score, rank, deciles. I guess that CD_RES_SECTOR is a geo_id and score is the deprivation index, but what about rank and deciles? We cannot assume that the users of the R package will provide this information.

@arnopauwels Could you please provide the R code to obtain these last two columns from the first two? I would be very thankful.

ungatoverde commented 1 week ago

@arnopauwels Ah, I see. You first rank the deprivation index and then group the municipalities into ten equal groups. I will try to code it myself.
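A minimal base-R sketch of that idea, deriving rank and deciles from the score column alone, could look like this. The helper name `assign_deciles()`, the toy scores and the tie-breaking rule are assumptions, not the original code. Whether a higher score means more or less deprivation is not stated here, so the sketch simply ranks in ascending order of score.

```r
# Hypothetical helper: rank the scores and split the ranks into
# n_groups groups of (nearly) equal size.
assign_deciles <- function(score, n_groups = 10) {
  score_rank <- rank(score, ties.method = "first")  # assumption: ties broken by order
  ceiling(score_rank * n_groups / length(score))
}

# Toy data with 20 geographical units (illustrative values only)
toy <- data.frame(
  CD_RES_SECTOR = sprintf("S%02d", 1:20),
  score = seq(5, 100, by = 5)
)
toy$rank <- rank(toy$score, ties.method = "first")
toy$deciles <- assign_deciles(toy$score)

# Each decile should contain the same number of units
decile_counts <- table(toy$deciles)
print(decile_counts)
```

With 20 units, each of the ten deciles contains exactly two units.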

ungatoverde commented 6 days ago

@arnopauwels I have adapted your code on social inequalities to integrate it into our R package. In the branch inequalities, I got a version that works but gives different results. Digging into the method, I found something that could be the reason for the divergence.

The ranking and the deciles by municipality can be found here, but the calculation is done somewhere else. See code.

Anyway, this data set has 18764 rows.

After this, you filter the mortality data set by region and remove rows without population. As a result, you get a data set with 8842 rows. See code.

And then you merge. See code.

This is a problem. The ranking belongs to a data set that is larger than the mortality data set. I think it is more reasonable to do it the other way round: first clean up the data set and then assign the ranking and deciles. Don't you think so?
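The effect of the order of operations can be sketched with a toy example. The data, the `population` column name and the helper `assign_groups()` are hypothetical, and two groups are used instead of ten to keep the table small.

```r
# Hypothetical helper: split ranked scores into n_groups equal groups
assign_groups <- function(score, n_groups) {
  ceiling(rank(score, ties.method = "first") * n_groups / length(score))
}

# Toy deprivation table (illustrative values only)
deprivation <- data.frame(
  CD_RES_SECTOR = sprintf("S%02d", 1:8),
  score = c(10, 20, 30, 40, 50, 60, 70, 80),
  NUTS1 = c("BE1", "BE3", "BE2", "BE1", "BE2", "BE2", "BE2", "BE2"),
  population = c(100, 200, NA, 150, 300, 120, 250, 90)
)
keep <- deprivation$NUTS1 == "BE2" & !is.na(deprivation$population)

# Order A: rank on ALL rows, then filter
rank_then_filter <- deprivation
rank_then_filter$group <- assign_groups(rank_then_filter$score, 2)
rank_then_filter <- rank_then_filter[keep, ]

# Order B: filter first, then rank on the remaining rows only
filter_then_rank <- deprivation[keep, ]
filter_then_rank$group <- assign_groups(filter_then_rank$score, 2)

print(rank_then_filter$group)
print(filter_then_rank$group)
```

In this toy case, ranking before filtering leaves all four remaining units in the same group, whereas filtering first splits them evenly into two groups.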

See below some screenshots that illustrate the problem.

This is the table that you provided with the deprivation index, the score (ranking) and the decile:

image

The ranking, and therefore the deciles, were calculated based on ALL these rows (municipalities?). However, some of them will be excluded because you have no population data for them or because NUTS1 is not equal to BE2. Excluding first and then computing the ranking and the deciles results in this different table:

image

arnopauwels commented 5 days ago

Hello Alberto,

It would indeed make sense to follow the approach you suggest (first clean up/merge the deprivation and mortality datasets, then compute the index by ranking and assigning to deciles).

In the script, I subset the data for Flanders (NUTS1 == 'BE2'; 8842 sectors) because the inequalities are greater if you look at them by region instead of Belgium (18764 sectors) as a whole, so I thought it would be a nicer example. However, I forgot that it is then necessary to redo the ranking and split into deciles, like you suggested.

Nice to see the progress that has been made despite my confusing example ;)

ungatoverde commented 5 days ago

@arnopauwels Thanks! Nice to see that we are on the same page :-)