n cases

n1 = sum(ccdat$y==1)

n controls

n0 = sum(ccdat$y==0)

intercept correction: population log-odds

ccdat$c0 = log((q0/(1-q0))/(n1/n0))

On the other hand, I suppose you add the minus sign to avoid convergence issues, or is there some other reason? Thanks

I think either way is fine, and I agree it should match the paper, so I've updated it in commit ddb004c883c0bd71206fad9942101a422f58de0f. I was using the basic formula I've seen from Greenland (among others), but the two give the same result since the sample size cancels out in the ratio. The negative sign is necessary because the offset term goes on the right side of the model: it takes the overestimate of the baseline odds (due to undersampling non-cases from the population) and reduces it. Without it, the correction doesn't work in R. You can confirm this by comparing the estimated intercept in the "m1" model fit to the cohort with the intercept in the "m_interceptcorrected" model; they should be approximately equal.
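For anyone who wants to check the sign argument without the repository data, here is a minimal simulated sketch. It reuses the names `ccdat`, `q0`, `n1`, `n0`, `c0`, `m1` and `m_interceptcorrected` from this thread, but the simulated cohort, the model formulas, the parameter values, and the choice to apply the minus sign in a separate `off` column are all assumptions made for illustration, not the repository code.

```r
# Hypothetical simulation: cohort, formulas and parameter values are made up.
set.seed(1)
N <- 200000
x <- rbinom(N, 1, 0.3)
y <- rbinom(N, 1, plogis(-4 + 0.7 * x))   # true population intercept is -4
cohort <- data.frame(x = x, y = y)
q0 <- mean(cohort$y)                      # outcome proportion in the cohort

# Case-control sample: every case plus an equal number of random non-cases
cases <- cohort[cohort$y == 1, ]
noncases <- cohort[cohort$y == 0, ]
controls <- noncases[sample(nrow(noncases), nrow(cases)), ]
ccdat <- rbind(cases, controls)

n1 <- sum(ccdat$y == 1)                   # n cases
n0 <- sum(ccdat$y == 0)                   # n controls

# c0 as written in the thread; it is negative when cases are oversampled
ccdat$c0 <- log((q0 / (1 - q0)) / (n1 / n0))
# Assumption: the minus sign is applied here, so the offset is positive
ccdat$off <- -ccdat$c0

m1 <- glm(y ~ x, family = binomial(), data = cohort)
m_naive <- glm(y ~ x, family = binomial(), data = ccdat)
m_interceptcorrected <- glm(y ~ x + offset(off), family = binomial(),
                            data = ccdat)

# Intercepts side by side: cohort, uncorrected, offset-corrected
ints <- c(cohort = coef(m1)[["(Intercept)"]],
          naive = coef(m_naive)[["(Intercept)"]],
          corrected = coef(m_interceptcorrected)[["(Intercept)"]])
print(round(ints, 2))
```

Under these assumptions the uncorrected intercept should sit well above the cohort intercept, while the offset-corrected one should land close to it. The slope for `x` is unchanged by the offset, since a constant offset only shifts the intercept.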

alexpkeil1 / Case-control-causal-review

Comment on ccc_interceptcorrection #2
