fedarko opened this issue 4 years ago (status: Open)
Yeah, I need to write up a blog post on this. It should be up within the next 3 weeks.
Was chatting with @antgonza today about formula stuff, and I found this video from one of the Patsy devs -- it does a super good job explaining both categorical encodings and intercept stuff.
A few relevant timestamps:
For "normal" uses of Patsy the intercept is the mean of whatever the "reference" group is, and everything else represents differences from this mean. So e.g. in the OLS example data on the screen at around 6:40, the Intercept coefficient (group 1 reference) is 46.4583, and the group 2 coefficient is 11.5417. And when you set group 2 as the reference instead, the group 1 coefficient is -11.5417 (because things have been flipped now), and the group 2 coefficient is 58 (aka 46.4583 + 11.5417).
I'm not quite sure how this translates to an interpretation of the Intercept differentials you get, but at the very least it'd be good to add a link to this video to the README in the future.
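For anyone who wants to check the reference-group arithmetic without watching the video, here is a minimal sketch in plain Python. It does not use Patsy itself; it just fits an intercept plus one 0/1 dummy column by ordinary least squares, which is what treatment (reference) coding amounts to for a two-group factor. The data values and the `fit_treatment_coding` helper are made up for illustration and are not the numbers from the video.

```python
from statistics import mean


def fit_treatment_coding(values_by_group, reference):
    """Fit y = intercept + slope * dummy by OLS (hypothetical helper).

    dummy is 0 for samples in the reference group and 1 otherwise,
    mimicking treatment coding of a two-level categorical variable.
    """
    x, y = [], []
    for group, values in values_by_group.items():
        for value in values:
            x.append(0.0 if group == reference else 1.0)
            y.append(float(value))

    # Closed-form simple linear regression.
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data: group 1 has mean 46, group 2 has mean 58.
data = {
    "group1": [40, 50, 48],
    "group2": [55, 60, 59],
}

# Group 1 as reference: intercept is the group 1 mean,
# and the coefficient is (group 2 mean - group 1 mean).
intercept_ref1, coef_group2 = fit_treatment_coding(data, reference="group1")

# Group 2 as reference: intercept is the group 2 mean,
# and the coefficient flips sign.
intercept_ref2, coef_group1 = fit_treatment_coding(data, reference="group2")

print(intercept_ref1, coef_group2)
print(intercept_ref2, coef_group1)
```

With group 1 as the reference, the intercept comes out as the group 1 mean and the other coefficient as the difference in means. Swapping the reference flips the sign of that coefficient and moves the intercept to the group 2 mean, which is the same pattern as the 46.4583 / 11.5417 / 58 numbers above.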
Thanks for raising this issue, fedarko! I had the same question.
This has come up before, but I'm making it an issue here so it's officially written down somewhere.
From discussion with @antgonza and many other people :) Relates to biocore/qurro#229.