Closed: n400peanuts closed this issue 1 year ago
Hi @jiqicn, great that you ran the test, and it did indeed prove useful!! yuppie.
It's interesting that columns 8 and 9 are failing systematically. This definitely did not happen in my tests in R, so it's worth double-checking that:
1. I exported the file correctly (I will repeat the export);
2. the tests you are running and the cross-tabs you are comparing match mine (it's possible that you are using cross-tabs different from mine, hence the mismatch; note that I had to manually import the cross-tabs from Excel into R, so it's very likely that I made a mistake somewhere).
I'd say, though, that it's great we found this. I will fix it as soon as I have time (I have a hard deadline next Monday and other things to work on before that, so I cannot guarantee a fix before your holiday, but don't worry about it, I'll manage on my own :))
Hi @n400peanuts, no need to rush; you should rest when you're feeling sick!
BTW, it may be good to let you know: I used the crosstab that you sent to Kristina and Astrid with your R code on 03/07/2023, and I also imported the crosstabs by hand, so it's also possible that I made a mistake without realizing it.
It's OK. In the end, if you found that columns 8 and 9 do not match, it means that I used the wrong information when generating the weights. In any case it's fixable, and it's only good news that you found the issue! :)
Hi @jiqicn, good news! All the tests are passing, and both your function for sampling from the IPF output and the weights file work perfectly (yuppieee).
It took me a couple of days to solve this issue, and I have made an updated file. Rather than changing yours, I felt it was better to create a new file, since the changes I made are substantial.
So the problems that I found were:

- I used `chi2_contingency` rather than `chisquare`. This is based on this Stack Exchange post, which really opened my mind: `chi2_contingency` is the equivalent of the `chisq.test` function I used in R (see the sketch after this list).
- In general, I found it hard to manually double-check so many cross-tabs without labels for their columns and rows, so I put them in pandas data frames and then in a dictionary. I am aware that this is not the most elegant solution, and your numpy arrays + dict solution looked better, but if there is an error somewhere, I feel it's important to have the labels.
- I think there is no need to test all 43 cross-tabs: in the end these are particular combinations of 9 main variables, so testing only those 9 variables is a good enough approximation. I also went for this solution because my code was already complex enough, and I am sure it could be done better.
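For illustration, here is a minimal, hypothetical sketch of the first two points; the variable names, labels, and counts are made up, not taken from the actual test file. It keeps a labelled crosstab in a pandas DataFrame inside a dictionary and checks it with `chi2_contingency`, which runs the test of independence on a full table (like R's `chisq.test` on a matrix), whereas `chisquare` is a one-way goodness-of-fit test on a single vector of frequencies:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical labelled crosstabs, keyed by variable pair. The row and
# column labels make it much easier to see which cells mismatch.
crosstabs = {
    "gender_x_self_rated_health": pd.DataFrame(
        [[120, 80, 30],
         [110, 95, 25]],
        index=["female", "male"],
        columns=["good", "fair", "poor"],
    ),
}

for name, tab in crosstabs.items():
    # chi2_contingency derives the expected counts from the table's
    # margins, which is what R's chisq.test() does on a matrix;
    # chisquare would instead need the expected frequencies up front.
    chi2, p, dof, expected = chi2_contingency(tab.to_numpy())
    print(f"{name}: chi2={chi2:.2f}, dof={dof}, p={p:.3f}")
```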
All in all, I think it would be great if you could do a code review of this solution! But since you are on holiday, in the meantime I'm going to ask someone else in the team.
Cheers, and have a nice holiday!
Hi @n400peanuts, I wrote the unit test to check whether the generated sample set is aligned with the crosstab given by Kristina; you can find the test file in the `ipf_sample` branch (you may want to remove the `parametrize` mark before merging it to `main`, to avoid repeating the tests 100 times on GitHub Actions).

I repeated the 100-run test many times on my computer, and usually only a few of the runs passed (see the screenshot below). Most of the time the failing runs show a p-value below the significance threshold for columns 8 and 9, which are `self-rated health` and `critical job`, but it sometimes also happened for columns 0 and 8 (`gender` and `self-rated health`).

I also double-checked whether there are typos in my code (e.g. giving the wrong crosstab information), but I can't find any.
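For context, the `parametrize` mark I mean works roughly like the following minimal sketch; the reference crosstab and the sampling helper here are hypothetical stand-ins, not the actual code in the `ipf_sample` branch:

```python
import numpy as np
import pytest
from scipy.stats import chisquare

# Hypothetical reference crosstab (made-up counts, not Kristina's data).
EXPECTED = np.array([[120.0, 80.0],
                     [110.0, 95.0]])

def draw_sample_crosstab(rng):
    """Hypothetical stand-in for sampling from the IPF weights."""
    return rng.poisson(EXPECTED).astype(float)

# Each value of `run` becomes a separate test case, so this one function
# expands into 100 tests -- useful for a stochastic sampler locally, but
# wasteful on GitHub Actions, hence the suggestion to remove the mark
# before merging to main.
@pytest.mark.parametrize("run", range(100))
def test_sample_matches_crosstab(run):
    rng = np.random.default_rng(run)
    observed = draw_sample_crosstab(rng)
    # Rescale the expected counts so both vectors sum to the same total,
    # as scipy's chisquare requires.
    f_exp = EXPECTED.ravel() / EXPECTED.sum() * observed.sum()
    stat, p = chisquare(observed.ravel(), f_exp=f_exp)
    # Fail when the sample differs significantly from the reference.
    assert p > 0.05
```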
I'm not sure whether this testing result can be treated as proof that the weights generated by the IPF algorithm need to be improved. What do you think?