Open myoung3 opened 3 years ago
Thanks for looking into this, @myoung3. I am very interested in providing an interface to Cyclops for time-varying covariates that is both convenient and efficient. Naturally, efficiency here covers both "space" (as you bring up above) and "time" (compute speed, which may decrease dramatically with the extra layer of memory indirection).
Could I entice you to work on this further with my group?
Do you have a specific use-case in mind where performance / memory-usage becomes an issue?
From the release package documentation
The correct way to deal with time-varying data in a Cox model is to split each individual's follow-up period into multiple intervals, one at each change in their covariate values. Thus a time-varying dataset for Cox analysis would have more than 1 row per person, and the above data spec would require the covariates object to have the same number of rows as the outcome object. In the case of a Cox model with both time-varying and time-invariant variables, all of the time-invariant values would need to be repeated for every interval within each participant. A more efficient data structure would allow a time-invariant covariates object that joins to the outcome object on participant id, along with a time-varying covariates object that links to the outcome on both participant id and time.
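To make the proposed layout concrete, here is a minimal sketch in plain Python. It is not Cyclops code. The data, the field names (`id`, `start`, `stop`, `event`, `sex`, `dose`) and the helper `expand` are all hypothetical, chosen only to illustrate the two joins described above. The time-invariant values are stored once per participant and are only repeated when the full long-form table is materialized.

```python
# Hypothetical example data; none of these names come from the Cyclops API.

# Outcome object: one row per (participant, follow-up interval).
outcome = [
    {"id": 1, "start": 0, "stop": 5, "event": 0},
    {"id": 1, "start": 5, "stop": 9, "event": 1},
    {"id": 2, "start": 0, "stop": 7, "event": 0},
]

# Time-invariant covariates: stored once per participant, keyed on id.
invariant = {
    1: {"sex": "F"},
    2: {"sex": "M"},
}

# Time-varying covariates: keyed on (id, interval start time).
varying = {
    (1, 0): {"dose": 10},
    (1, 5): {"dose": 20},
    (2, 0): {"dose": 10},
}


def expand(outcome, invariant, varying):
    """Build the conventional long-form table by joining both covariate objects."""
    rows = []
    for interval in outcome:
        row = dict(interval)
        # Join on participant id only.
        row.update(invariant[interval["id"]])
        # Join on participant id and time.
        row.update(varying[(interval["id"], interval["start"])])
        rows.append(row)
    return rows


long_form = expand(outcome, invariant, varying)
```

In this sketch participant 1 contributes two rows to `long_form`, and `sex` is duplicated only in that expanded result, not in the stored inputs. Whether the join is done up front like this or lazily during fitting is the space versus time trade-off raised in the reply above.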