ajdamico / convey

variance of distribution measures estimation of survey data
GNU General Public License v3.0

Using influence functions to analyze measures #262

Closed guilhermejacob closed 2 years ago

guilhermejacob commented 7 years ago

Van Kerm (2015) presents 3 possible uses for influence functions:

  1. Variance Estimation
  2. Study the structure of summary statistics
  3. RIF regression

The first point we already do, but the second is particularly interesting. It can show where in the distribution a measure is most sensitive.

See also Essama-Nssah & Lambert (2011).

guilhermejacob commented 7 years ago

This is more a context than a convey thing. It would be nice to show examples of how to use this information.

Do you agree, @DjalmaPessoa ?

DjalmaPessoa commented 7 years ago

I saw the slides from Van Kerm (2015). The influence function plots shown there are useful in the context of robustness of estimators. The concept was proposed by Hampel for that purpose. In convey we are using influence functions to get variance estimators. Could you give any interesting interpretation of these plots in the area of poverty and income concentration (not for statisticians)?

guilhermejacob commented 7 years ago

They can be used to show interesting aspects of the measures.

For instance, in context, we argue that FGT(0) doesn't account for the depth of poverty; FGT(1) does, but doesn't account for inequality among the poor; and FGT(2) accounts for all three aspects (incidence, depth, and inequality among the poor).

Putting that in a (very rough) plot: image

This way, we can see that FGT(0) weights all observations below the poverty line equally, irrespective of whether they are $1 or $1000 below it. On the other hand, the "influence" of an observation on FGT(1) and FGT(2) decreases as its income approaches the poverty line. The difference between FGT(1) and FGT(2) is the rate of decrease: in FGT(1), the influence is linear in the gap between income and the poverty line, while in FGT(2) it grows non-linearly (with the square of the gap) as the gap increases.
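To make the shapes concrete, here is a minimal Python sketch (not convey code). It assumes a fixed poverty line `z`, an unweighted sample, and the usual influence function for a mean-type statistic, i.e. the per-observation FGT term minus the FGT estimate itself. The incomes, the poverty line, and all function names are hypothetical.

```python
def fgt_term(y, z, alpha):
    """Per-observation FGT contribution for a fixed poverty line z."""
    if y >= z:
        return 0.0  # the non-poor contribute nothing
    return ((z - y) / z) ** alpha  # normalized poverty gap raised to alpha


def fgt(incomes, z, alpha):
    """Unweighted FGT(alpha): the mean of the per-observation terms."""
    return sum(fgt_term(y, z, alpha) for y in incomes) / len(incomes)


def fgt_influence(y, incomes, z, alpha):
    """Influence of an income y on FGT(alpha), with z held fixed."""
    return fgt_term(y, z, alpha) - fgt(incomes, z, alpha)


# Hypothetical sample and poverty line, for illustration only.
incomes = [100, 400, 700, 1200, 2000, 3500]
z = 1000

# Evaluate the three influence curves on a small income grid.
grid = [0, 250, 500, 750, 1000, 1500]
for alpha in (0, 1, 2):
    curve = [round(fgt_influence(y, incomes, z, alpha), 3) for y in grid]
    print(f"FGT({alpha}):", curve)
```

Under these assumptions, the FGT(0) curve is flat below `z`, the FGT(1) curve falls linearly towards the line, and the FGT(2) curve falls with the square of the gap. Above `z`, every curve is constant at minus the estimate.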

Also, we can use this kind of plot to argue why we should account for the extra variance in measures that use relative poverty lines, as it would show that the non-poor also influence the overall measure.
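A small numerical illustration of that point, again as a rough Python sketch rather than convey code. It assumes a relative line set at 60% of the median income (the share is an assumption made for this example) and compares it with a fixed line. The incomes and function names are hypothetical.

```python
from statistics import median


def headcount(incomes, line):
    """Share of incomes strictly below the poverty line."""
    return sum(1 for y in incomes if y < line) / len(incomes)


def relative_headcount(incomes, share=0.6):
    """Headcount with a relative line: `share` times the median income."""
    return headcount(incomes, share * median(incomes))


# Hypothetical incomes. In `shifted`, one non-poor income rises from 40 to 100.
base = [10, 20, 25, 40, 50, 60, 70]
shifted = [10, 20, 25, 100, 50, 60, 70]

fixed_line = 24  # equals 60% of the median of `base`

fixed_before = headcount(base, fixed_line)
fixed_after = headcount(shifted, fixed_line)
rel_before = relative_headcount(base)
rel_after = relative_headcount(shifted)

print("fixed line:   ", fixed_before, "->", fixed_after)
print("relative line:", rel_before, "->", rel_after)
```

With the fixed line, changing a non-poor income leaves the headcount untouched. With the relative line, the same change moves the median, which moves the line, and the person with an income of 25 becomes poor. That dependence on the non-poor is what a variance estimator for relative measures has to account for.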

But you're the statistician! You tell me if that makes sense and if it is worth adding.

DjalmaPessoa commented 7 years ago

I believe it is worth adding as long as you can give interesting subject matter interpretations, like the one you just gave. Statisticians would be more interested in robustness questions like: what happens to the mean, the median, or some other estimator when we let one observation go to infinity? This is not very relevant to convey.

guilhermejacob commented 6 years ago

@DjalmaPessoa , can I add a pointer to that kind of analysis in the Influence functions section?

DjalmaPessoa commented 6 years ago

I’m not sure I understand what you mean by a pointer.

On Nov 20, 2017, at 17:35, Guilherme Jacob notifications@github.com wrote:

@DjalmaPessoa , can I add a pointer to that kind of analysis in the Influence functions section?


guilhermejacob commented 3 years ago

Related to #242