Is your feature request related to a problem? Please describe.
Add an estimating equation for pooled (logistic) regression to support survival analysis operations. This is a finite-dimensional M-estimator, so standard theory applies. It also opens up various survival analysis options, like computing IPCW, g-computation, and others.
Describe the solution you'd like
Build an estimating equation for pooled logistic regression that does not require a long data set. Specifically, we should evaluate something like the following:
$$\sum_{i=1}^n \left( \sum_{k \in R} (\Delta_i t_k - m(W_i, S_i; \beta)) \left[ W_i, S_i \right]^T \right) = 0$$
This yields a compact estimating equation that avoids expanding the data into a long format, which in turn avoids mistakes users might introduce during data processing. This is the advantage of working with the score! However, specifying the estimating equation programmatically requires some finesse, particularly for the time design matrix (i.e., $S$), which depends on $k$.
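To make the idea concrete, here is a minimal NumPy sketch of how the inner sum over $k$ could be evaluated without a long data set. It is not a proposed final API. The function name `ee_pooled_logit` and its arguments are hypothetical, and several assumptions are made that the equation above does not pin down: $m$ is taken to be the inverse-logit, the outcome term is read as an indicator that unit $i$ has the event at $t_k$, units contribute only while at risk, and times are assumed to lie on a discrete grid.

```python
import numpy as np


def inverse_logit(x):
    return 1.0 / (1.0 + np.exp(-x))


def ee_pooled_logit(beta, W, t, delta, event_times, S):
    """Per-unit score contributions, shape (p_w + p_s, n).

    Hypothetical signature, for illustration only.
    W           : (n, p_w) baseline design matrix
    t           : (n,) observed follow-up time (assumed on a discrete grid)
    delta       : (n,) event indicator (1 = event, 0 = censored)
    event_times : (K,) time points t_k that are summed over
    S           : (K, p_s) time design matrix, one row per t_k
    """
    n = W.shape[0]
    score = np.zeros((W.shape[1] + S.shape[1], n))
    for k, t_k in enumerate(event_times):
        # Assumption: a unit contributes at t_k only if still under follow-up
        at_risk = (t >= t_k).astype(float)
        # Assumption: outcome at t_k is 1 only if the event happens at t_k
        y_k = delta * (t == t_k)
        # Row k of S is shared by every unit at this time point
        X_k = np.hstack([W, np.tile(S[k], (n, 1))])
        resid = at_risk * (y_k - inverse_logit(X_k @ beta))
        score += (resid[:, None] * X_k).T
    return score


# Toy data: three units, two time points, intercept plus one time indicator
W = np.ones((3, 1))
t = np.array([1, 2, 2])
delta = np.array([1, 0, 1])
event_times = np.array([1, 2])
S = np.array([[0.0], [1.0]])

score = ee_pooled_logit(np.zeros(2), W, t, delta, event_times, S)
```

The loop touches each time point once and never materializes the `n * K` row long data set; only the `(K, p_s)` matrix `S` carries the dependence on $k$.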
Challenges here:
Need to process the time design matrix if we don't convert to a long data structure
Weights can be time-dependent, which complicates an implementation that avoids a long data structure (the weights are then a matrix instead of a vector)
Users can't directly control who contributes, since the compact structure sums over time internally. This can be modified through the weights argument, but that is more opaque.
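One possible way to handle the weight-related challenges above is to always promote the weights to an `(n, K)` matrix internally, so that time-fixed and time-dependent weights share one code path. The sketch below is only an illustration; the helper `apply_weights` is hypothetical, and it assumes the residuals have already been arranged as one row per unit and one column per time point.

```python
import numpy as np


def apply_weights(resid, weights=None):
    """Weight an (n, K) matrix of residuals.

    Hypothetical helper, for illustration only.
    weights : None (unweighted), (n,) time-fixed, or (n, K) time-dependent
    """
    n, K = resid.shape
    if weights is None:
        w = np.ones((n, K))
    else:
        w = np.asarray(weights, dtype=float)
        if w.ndim == 1:
            # Time-fixed weights: repeat each unit's weight across time points
            w = np.repeat(w[:, None], K, axis=1)
        if w.shape != (n, K):
            raise ValueError("weights must have shape (n,) or (n, K)")
    return w * resid


resid = np.array([[0.5, 0.0], [-0.5, -0.5], [-0.5, 0.5]])

# Time-fixed weights; a weight of zero drops unit 3 entirely
fixed = apply_weights(resid, weights=[1, 2, 0])

# Time-dependent weights; a zero entry drops unit 2 at the second time only
varying = apply_weights(resid, weights=[[1, 1], [1, 0], [2, 2]])
```

Zero entries in the weight matrix are how a user could exclude specific unit-time contributions, which is the more opaque form of control mentioned in the last item.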
Describe alternatives you've considered
Coding it from scratch each time (I would rather not, and built-in support would be good for users).
Additional context
Abbott, R. D. (1985). Logistic regression in survival analysis. American Journal of Epidemiology, 121(3), 465-471.
D'Agostino, R. B., Lee, M. L., Belanger, A. J., Cupples, L. A., Anderson, K., & Kannel, W. B. (1990). Relation of pooled logistic regression to time dependent Cox regression analysis: the Framingham Heart Study. Statistics in Medicine, 9(12), 1501-1515.
Hernán, M. A. (2010). The hazards of hazard ratios. Epidemiology, 21(1), 13-15.
Ngwa, J. S., Cabral, H. J., Cheng, D. M., Pencina, M. J., Gagnon, D. R., LaValley, M. P., & Cupples, L. A. (2016). A comparison of time dependent Cox regression, pooled logistic regression and cross sectional pooling with simulations and an application to the Framingham Heart Study. BMC Medical Research Methodology, 16, 1-12.