Closed essisyrjala closed 2 years ago
Dear Essi,
Without seeing your data, I guess that this is caused by the fact that the knots are placed at equidistant points in jm()
, which is a bit different from what was done in jointModel()
. Perhaps, you could try setting the knots yourself.
Dear Dimitris,
Thank you for your rapid answer.
Yes, it's true that knots are not at the same places. Actually, I haven't found a way to set them by myself in jm()
, like in jointModel()
function. The similar 'knots=c(1.75, 3.75)' gives a warning message. How can this be done in jm()?
However, I'm still confused since the knots are placed at equidistant points between the first and the last endpoint, not between the start and the end of the follow-up period. Because for me, this output with control=list(Bsplines_degree=0, base_hazard_segments=3))
seems like the first boundary knot would be placed at 0.7876454 years, although the follow-up (and exposure data) starts from 0:
summary(jmod_diab_ENERC_total_MJ_cons3)$control$knots$1 [1] 0.7876454 2.4417636 4.0958818 5.7500000
But if I manipulate the data by changing one endpoint to 0.01 years, the first boundary knot shifts to this time point, and knots are placed at equidistant points between 0.01 and 5.75 years:
summary(jmod_diab_ENERC_total_MJ_cons3_test1)$control$knots$1 [1] 0.010000 1.923333 3.836667 5.750000
And after this manipulation HR: 23.2 (19.0, 28.2) becomes as HR: 1.40 (95% CI: 1.01, 1.88). I know that the knots are placed at the different points in these cases, but I don't fully understand, how the follow-up time before the time 0.79 is taken into account in modeling?
With best regards, Essi
You're correct; I had forgotten to put a knots
control argument. It is now there, but you will need to install the development version from GitHub.
Also, indeed, the problem is caused by the manner JMbayes2:::knots
creates the knots. In particular, when the degree is zero, it starts from the minimum event time, whereas it would be better to start from zero. I made an update for this case - could you run your code again?
Thank you for your reply.
I installed the development version from GitHub and tried to fit a model with specific knot places, but now I got a warning message of Error in rep.int(nk, max(0L, ord - match(TRUE, dkn > 0))) : invalid 'times' value
. And if I try with linear or quadratic form of baseline hazard, the obtained warning message is Error in FUN(X[[i]], ...) : 'ord' must be positive integer, at most the number of knots
.
Could you install it and try again? I made another update on how the knots are computed when the degree is zero.
Thank you, I installed it again. And sorry for the delay in my answer.
Now I see, that the first boundary knot is placed at the beginning of the follow-up period (at 0) with control=list(Bsplines_degree=0,base_hazard_segments=3)
option. But for some reason, something strange happens in the end of the period, since knot-output of the baseline hazard is now:
> summary(jmod_diab_ENERC_total_MJ_cons3)$control$knots$
1 [1] 0.000000 1.654118 3.308236 4.962355
although my last observed case is at 5.73 years and censored observations are at 5.75 years. It seems that the whole time period is shifted left (towards beginning) by 0.79 years (as my first boundary knot was before at 0.79 years)?
In addition, I still could not make the user-specified knots working, with control=list(Bsplines_degree=0, knots=c(1.75,3.75))
or similarly with control=list(Bsplines_degree=0,base_hazard_segments=3,knots=c(1.75,3.75))
the function gives an error message of Error in rep.int(nk, max(0L, ord - match(TRUE, dkn > 0))) : invalid 'times' value
.
I made another small change - could you try again?
Thank you!
Unfortunately, I still get the same result as previously, with knots placing incorrectly at [1] 0.000000 1.654118 3.308236 4.962355
.
And the same error occurs when trying to set knots for piecewise-constant hazard by myself: Error in rep.int(nk, max(0L, ord - match(TRUE, dkn > 0))) : invalid 'times' value
. If trying to set the knots for piecewise-quadratic baseline hazard, error becomes as Error in FUN(X[[i]], ...) : 'ord' must be positive integer, at most the number of knots
.
This is strange! Can you check if JMbayes2:::knots
in your installation is the same as what you see here.
Also, sorry that I forgot to say that the control argument knots
should be a list.
Yes, it appears to be the same in my installation:
`> JMbayes2:::knots function (xl, xr, ndx, deg) { dx <- (xr - xl)/ndx start <- if (deg < 1) 0 else xl - deg dx end <- if (deg < 1) xr + 0.001 else xr + deg dx seq(start, end, by = dx) } <bytecode: 0x000001d559355bf0>
It should be something like jm(..., knots = list(c(0, 1.65, 3.31, 4.96, 7.05)))
.
When I try this example with the new version,
MixedFit <- lme(log(serBilir) ~ year, data = pbc2, random = ~ year | id)
pbc2.id$Time <- pbc2.id$years
pbc2.id$event <- as.numeric(pbc2.id$status != "alive")
CoxFit <- coxph(Surv(Time, event) ~ drug + age, data = pbc2.id)
JMFit1 <- jm(CoxFit, MixedFit, time_var = "year", Bsplines_degree = 0L,
base_hazard_segments = 5)
JMFit1$control$knots
I get
> JMFit1$control$knots
$`1`
[1] 0.000000 2.838681 5.677363 8.516044 11.354726 14.193407
which seems OK to me.
Okay, I see. Now I get knots
working! So now, compared to the jointModel
function, in jm
knot places should be given as a list, and also boundary knots should be given. Great, thank you!
However, considering your above example, the distribution for the event times is:
> summary(pbc2.id$years)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.1123 3.7085 6.2959 6.4112 8.8798 14.3057
and your last boundary knot is at 14.1934 years, instead of 14.3057 years (maximum value). The difference between those numbers is 0.1123, for which reason I suspect, that the whole time period for baseline hazard is shifting left (towards the beginning), not just the place for the first boundary knot. And since the minimum value of my event times is 0.79, instead of only 0.1123, the effect is more dramatic compared to the example with pbc dataset.
The new version should work.
Yes, it works! Perfect, thank you so much.
My one last further question (this is the last one, I promise): have you considered to allow the use of interval censored data with jm
function (or is it possible already)?
Interval censoring is already implemented in the package, though we have not tested it thoroughly yet.
Dear Professor Rizopoulos,
I have been using a lot the joint modeling packages developed by you, firstly the JM package, and currently the JMbayes2 package. I like to thank you for the important work you have done in this research area.
However, I faced a problem with the JMbayes2 package, due to baseline hazard estimation. I have fitted the joint models with baseline hazard set as piecewise constant with 2 knots (3 hazard segments; implemented by adding “control=list(Bsplines_degree=0, base_hazard_segments=3))”). Results have been plausible as long as I used the one particular endpoint, but when I started to fit models with a different endpoint, all the results (dependently on the exposure variable) became very strange.
To be clear, I am investigating the association between different longitudinally measured nutritional variables and the risk of type 1 diabetes. This problematic endpoint (type 1 diabetes, T1D) compared to the first one I used with plausible behavior (islet autoimmunity, IA) has
For example, the obtained hazard ratios (95% CI) for energy intake are 23.2 (19.0, 28.2) with piecewise constant, 2.18 (1.71, 2.78) with piecewise linear, and 1.46 (1.04, 2.02) with piecewise quadratic baseline hazard with 3 hazard segments. The last one agrees with the result obtained with jointModel function from the JM-package, where piecewise constant baseline hazard with knots at 1.75 and 3.75 years (in 5.75-year follow-up) was used.
However, I noticed that JMbayes2 may fit the piecewise constant baseline hazard between the first and the last censored or observed endpoint (by ignoring the time before the first endpoint?), since the boundary knots of the baseline hazard seems to be placed at 0.79 and 5.75 years for type 1 diabetes outcome. Here are the outputs relating to the baseline hazards of piecewise constant, piecewise linear and piecewise quadratic models (notice that the follow-up starts from time point 0):
`
1. Piecewise constant baseline hazard (with 2 knots)
2. Piecewise linear baseline hazard (with 2 knots)
3. Piecewise quadratic baseline hazard (with 2 knots)
But if I manually change the time of the endpoint of one child (censored or observed, this was done for the child having exposure data only from the first measurement at time 0, so I did not modify the exposure data) to time 0.01, I obtain a more rational result of HR: 1.40 (95% CI: 1.01, 1.88), and the same baseline hazard output for piecewise-constant hazard becomes as:
`
`
So, my question is: Is the baseline hazard estimation going wrong with piecewise-constant hazard? Is it ignoring the time before the first observed or censored endpoint in estimation?
Thank you in advance for the help!
With best regards, Essi Syrjälä