ordinal analysis suggestions

lcrouch1952 commented 4 years ago

My comment is that the Lorenz curves calculated with ASER data look a bit odd: lots of equality. Of course, it may be that this is real, but here are some things to check. I am not smart enough or patient enough to read the *.do code, so maybe you already do these things; in that case, just disregard these comments.

First, perhaps use only a single age or a single grade, if you can.

Second, maybe experiment with "ordinalizing" the PRIMR data (or TUSOME: Tim: which should she use? I would suggest using the most robust, well-behaved of the two data sets you analyzed). And then analyze with the ineqdec0 command and compare. There are two possible ways to ordinalize. The first is to sort the ORF data and break it into n equal-sized groups (say, 5), where n = however many ordinal levels ASER has, and then assign each kid to the right group. The second is to break them up into groups of the same SIZE as the ordinal groups in ASER (fixing on a grade or age: say, Grade 2 or 3 ASER kids or, if you have only age, 7 or 8). So, sort the ORF scores, then assign the bottom X% of kids to the first ordinal level, where X% is the share of the ASER sample in the first ordinal level, and so on up the levels.

Then see how those two graph, in comparison with how ORF itself graphs, using the PRIMR (or Tusome) data.
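To make the two ordinalizing options concrete, here is a minimal sketch in Python (not the project's Stata code). The function names, the toy ORF scores, and the ASER level shares are all made up for illustration. Note one assumption: tied scores (for example, many zeros) are split between groups by their order in the data, which may or may not be what you want.

```python
import numpy as np

def _ranks(scores):
    """Return 0-based ranks; ties are broken by original order (an assumption)."""
    order = np.argsort(scores, kind="stable")
    return np.argsort(order, kind="stable")

def ordinalize_equal_groups(scores, n_levels):
    """Option 1: sort and cut into n_levels groups of (nearly) equal size."""
    scores = np.asarray(scores, dtype=float)
    ranks = _ranks(scores)
    # Integer arithmetic maps ranks 0..n-1 onto levels 1..n_levels.
    return ranks * n_levels // len(scores) + 1

def ordinalize_by_shares(scores, shares):
    """Option 2: cut so each level holds the same share of kids as in `shares`."""
    scores = np.asarray(scores, dtype=float)
    shares = np.asarray(shares, dtype=float)
    ranks = _ranks(scores)
    # Cumulative number of kids at or below each level.
    bounds = np.round(np.cumsum(shares) / shares.sum() * len(scores)).astype(int)
    return np.searchsorted(bounds, ranks, side="right") + 1

# Toy ORF scores (hypothetical) and hypothetical ASER level shares.
orf = [12, 0, 45, 30, 7, 60, 22, 3, 51, 18]
equal_levels = ordinalize_equal_groups(orf, n_levels=5)
share_levels = ordinalize_by_shares(orf, shares=[0.5, 0.3, 0.2])
print(equal_levels.tolist())
print(share_levels.tolist())
```

Either set of levels can then be analyzed and graphed next to the raw ORF scores for the comparison described above.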

Another way to check robustness: you'd expect to see much greater inequality among, say, 8-year-olds than among 15-year-olds. I don't know how much lower the inequality should be at 15, but A LOT. Half as much? A third as much?
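A rough sketch of that age comparison, again in Python rather than the project's Stata code: compute a Gini coefficient separately for each age and look at the ratio. The `gini` and `gini_by_group` helpers and the toy scores are assumptions for illustration only, and this plain Gini treats the scores as continuous.

```python
import numpy as np

def gini(values):
    """Gini coefficient of non-negative values, using the sorted-rank formula."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    total = x.sum()
    if n == 0 or total == 0:
        return float("nan")  # undefined when there is nothing to distribute
    i = np.arange(1, n + 1)
    return 2.0 * np.sum(i * x) / (n * total) - (n + 1) / n

def gini_by_group(scores, groups):
    """Gini within each group; group labels are assumed to be integer ages."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    return {int(g): gini(scores[groups == g]) for g in np.unique(groups)}

# Toy data (hypothetical): very unequal 8-year-olds, identical 15-year-olds.
scores = [0, 0, 0, 40, 30, 30, 30, 30]
ages = [8, 8, 8, 8, 15, 15, 15, 15]
by_age = gini_by_group(scores, ages)
print(by_age)
```

If the real data show roughly the same Gini at 8 and at 15, that would be a sign that something in the pipeline needs a closer look.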

Tim, Mavzuna, if you think that these are helpful common sense suggestions for how to check whether this ordinality stuff is working out, let me know and I will think of a few more.

lcrouch1952 commented 4 years ago

Also of course if you don't understand what I am saying, we can talk on the phone, or I can attempt to lay this out much more carefully or give an example.

TSSlade commented 4 years ago

@mavzunat - do the above suggestions look actionable to you?

In re: which dataset to use - I don't think it'll make a big difference. I'd just suggest writing the script so that the analysis is passed a dataset defined in a local macro, so you can easily swap in/out differing analyses. (That'll also give us a way to easily triangulate what we're seeing.)

mavzunat commented 4 years ago

> Second, maybe experiment with "ordinalizing" the PRIMR data (or TUSOME: Tim: which should she use? I would suggest using the most robust, well-behaved of the two data sets you analyzed). And then analyze with the ineqdec0 command and compare.

I "ordinalized" the variable for both languages and both grades in both datasets and generated results. Please see the /inequality results/summary table (V02) in the Dropbox: https://www.dropbox.com/home/dukeInternInequalityOutputs/Mavzuna?preview=CS20-mt-S01-V02-inequality_results.xlsx.

The Gini coefficient looks very different from what we had before. Luis, did you mean I need to use the ineqord command to compare the results? ineqdec0 is designed for continuous variables, isn't it?

TSSlade commented 4 years ago

Yes, ineqdec0 is continuous only. So in this case you'd at least need to use ineqord. Other differences may be due to other issues, but that one at least we ought to rectify.
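One way to see why a continuous-only measure is a poor fit for ordinal levels: a plain Gini changes when the level codes are relabeled, even though the ordering of the kids does not change at all. The Python sketch below is only an illustration of that point; it does not reproduce what ineqdec0 or ineqord compute, and the `gini` helper, the level codes, and the relabeling are made up.

```python
import numpy as np

def gini(values):
    """Gini coefficient of positive values, using the sorted-rank formula."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    return 2.0 * np.sum(i * x) / (n * x.sum()) - (n + 1) / n

# Hypothetical ordinal levels for eight kids, coded 1..4.
levels = np.array([1, 1, 2, 2, 3, 3, 4, 4])

# Same ordering, different (equally arbitrary) numeric labels for the levels.
relabel = {1: 1, 2: 2, 3: 3, 4: 10}
relabeled = np.array([relabel[v] for v in levels])

gini_original = gini(levels)
gini_relabeled = gini(relabeled)
print(gini_original, gini_relabeled)
```

The two values differ although nothing about the ranking has changed, which is the kind of label dependence that a measure designed for ordinal data is meant to avoid.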

TSSlade / unesco_equity

ordinal analysis suggestions #10