Open cosgriffc opened 4 years ago
Sorry, the code for pearson.msm is messy! It's difficult for me to understand this issue though without a reproducible example - could you email me a minimal one, with simulated data if necessary?
Hello Dr. Jackson,
Thank you very much for your reply. I'll see if I can simulate data to reproduce it. The issue is right here in pearson.R:
## Groups based on time since initiation (not observation number)
timegroups.use <- min(length(unique(md$time)), timegroups)
md$timegroup <- qcut(md$time, timegroups.use) # HERE IS THE ISSUE
## Group time differences by quantiles within time since initiation
intervalq <- tapply(md$timeinterval[md$state %in% ndstates],
md$timegroup[md$state %in% ndstates],
## Categorise quantiles based only on full observations, not deaths or censoring.
function(x) quantile(x,probs=seq(0,1,1/intervalgroups)))
Basically the time values I'm using result in qcut forming an empty bin.
I've actually found a bit of a hacky workaround. My time values are in days t=(0, 7, 14, 28). I've re-run my model changing those to minutes and adding a random amount of minutes to introduce noise. This fixes the issue because qcut doesn't accidentally make an empty bin, and I don't think it meaningfully changes the underlying model results/conclusion (since adding some fake minutes to the time shouldn't meaningfully impact anything).
Best, Chris
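For anyone wanting to try the same workaround, here is a minimal sketch of the jittering step. The data frame, its column names, and the one-hour noise range are all made up for illustration; only the visit days (0, 7, 14, 28) come from the message above.

```r
set.seed(42)  # reproducible noise

# Simulated panel (hypothetical): 5 subjects observed at days 0, 7, 14 and 28.
dat <- data.frame(
  subject = rep(1:5, each = 4),
  days    = rep(c(0, 7, 14, 28), times = 5)
)

# Convert days to minutes, then add up to one hour of uniform noise.
dat$minutes <- dat$days * 24 * 60 + runif(nrow(dat), min = 0, max = 60)

# The noise is far smaller than the 7-day gap between visits, so the visit
# order within each subject is preserved and the times are no longer tied.
ordered_within <- all(tapply(dat$minutes, dat$subject,
                             function(x) !is.unsorted(x, strictly = TRUE)))
n_unique <- length(unique(dat$minutes))
```

Keep in mind that rescaling time from days to minutes also changes the units of any fitted transition intensities, which become rates per minute rather than per day.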
Hello Dr. Jackson,
Thank you very much for crafting MSM; I've been reading your papers and using this package for a new project and really appreciate all of your work. Today I've run into a funky issue.
On an earlier version of my data which did not have follow-up for everyone, the Pearson test ran without issue. Today I loaded up my new data and it reported
Error in pearson.msm(mc_msm, timegroups = 2) : Remove any absorbing-absorbing transitions from data and refit model
I understood the error; a few individuals who had died at timepoint=3 were included with timepoint=4 observations, but again simply logged as having died. I removed them and re-ran the model fitting procedure without issue. However, when I attempted to rerun the Pearson-like test
pearson.msm(mc_msm, timegroups=3)
I received the following error:
Error in seq.default(0, 1, 1/n) : argument "n" is missing, with no default
It works for timegroups=2, but not 3. I spent a few hours debugging and I believe I have figured out the problem: in removing the absorbing observations I have shifted the distribution of the observation times (since all patients have timepoints 1 through 4). As a result, qcut(md$time, 3) in the Pearson code goes from returning a vector with levels of 1, 2, 3 to a vector with levels 1, (1, 2], [2, 3). The rest of the grouping then breaks because the group defined by (1, 2) is empty.
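A rough way to check whether a set of observation times is prone to this kind of degenerate grouping is to look for tied quantile breaks. The sketch below uses base R's quantile rather than msm's internal qcut, whose exact behaviour may differ, and the visit counts are simulated, not the data from this issue.

```r
# Hypothetical visit numbers: many observations at visit 1 and fewer at each
# later visit (simulated counts for illustration only).
time <- rep(1:4, times = c(12, 4, 3, 2))

# Tertile breaks as computed by base R (default type 7 quantiles).
breaks <- quantile(time, probs = seq(0, 1, 1/3))

# Heavy ties at the first visit pull the lower tertile onto the minimum,
# so two of the four breaks coincide and one interval has zero width.
tied <- anyDuplicated(unname(breaks)) > 0
print(breaks)
print(tied)

# Counting observations per distinct value shows where the ties are.
print(table(time))
```

If tied is TRUE for the chosen number of groups, a smaller timegroups value or less heavily tied times is worth trying before digging into the package code.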
Having discovered the etiology, I've no clue how to fix it without changing the underlying code. I was curious whether you have ever encountered this before (and could suggest a remedy I may have missed, e.g. maybe my model is just being fit wrong). If not, the only solution I can think of would be to let the Pearson function accept a manual set of quantiles as input. I don't think this would be hard to do, and if that is the only route I'll try to implement it and submit a pull request.
Cheers, Chris
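The manual-quantiles idea proposed above could look roughly like the following. This is only a sketch: group_times is a hypothetical helper, not part of msm, and it relies on base R's cut rather than the package's own grouping code.

```r
# Hypothetical helper (not part of msm): assign observation times to groups
# using caller-supplied breaks, and fail early if any group would be empty.
group_times <- function(time, breaks) {
  grp <- cut(time, breaks = breaks, include.lowest = TRUE)
  if (anyNA(grp)) stop("some times fall outside the supplied breaks")
  if (any(table(grp) == 0)) stop("supplied breaks produce an empty group")
  grp
}

time <- rep(1:4, times = c(12, 4, 3, 2))   # simulated visit numbers
grp  <- group_times(time, breaks = c(1, 2, 3, 4))
print(table(grp))
```

Failing early on an empty group would surface the problem at the grouping step instead of later, inside the quantile calculation.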