**Open** · **sharrison5** opened this issue 3 years ago
Excellent thoughts. The current model is focused more on understanding what is happening, and is not geared towards actual predictive application.

On `p(PDD within 4 years | age, z, hallucinations, updrs_III)`: I'm not actually a fan of this approach when it comes to global z, as some people have a high probability of converting within a year while others are more likely to convert 2–3 years later. The way you have described it is how I think things should be done: have `p(apathy | t, ...)` and then determine the probability of developing apathy in any given window.

I would include `baseline_motor`, `baseline_cognition`, etc., all interacting with `t`. This is how any such model would be applied in practice: when predicting future times, all you know are the baseline values. All the longitudinal data is still included in the model due to the point above.

For now we are primarily interested in interpretability. If there were a benefit to being able to best predict which individuals will become apathetic (e.g. for clinical trials, effective treatments), then a black-box approach could definitely have utility.
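The "probability of developing apathy in any given window" idea can be sketched as a thin wrapper around whatever fitted model provides `p(apathy | t, ...)`. This is a hypothetical illustration (function and model names are invented), assuming remission is negligible so that anyone apathetic at `t + dt` but not at `t` converted within the window:

```python
import math

def p_develop_in_window(p_apathetic, t, dt, **covariates):
    """p_apathetic: callable (t, **covariates) -> p(apathetic | t, covariates)."""
    p_now = p_apathetic(t, **covariates)
    p_later = p_apathetic(t + dt, **covariates)
    # Conditional on not being apathetic at t (remission assumed negligible):
    return (p_later - p_now) / (1.0 - p_now)

def toy_model(t, beta_t=0.3, intercept=-2.0):
    # Toy stand-in for a fitted model: log-odds linear in years since diagnosis.
    return 1.0 / (1.0 + math.exp(-(intercept + beta_t * t)))
```

With the toy model, widening the window `dt` increases the conversion probability, as you'd expect.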
This is really useful, thank you!! 😄
For my benefit as much as anything else, I'll just quickly rehash one key distinction, as your comments have made a few things drop into place. There are two contexts in which I've been using the term *predictive* — associative versus temporally predictive — and they're really quite distinct: if the model is `p(apathetic | t + ∆t, motor_scores(t + ∆t), ...)` then we could say e.g. that lower motor scores are associated with a higher prevalence of apathy, but not that the motor scores are temporally predictive.

I think the outstanding question is whether we want to investigate a set of 'temporally predictive' variables, as well as / instead of the current associative model (I think this is still a good starting point). I'll have a think about what those terms could look like. The obvious ones would be e.g. `t` and baseline scores as you suggest (i.e. something like `p(apathetic | t + ∆t, motor_scores(t), ...)`).

Excellent! 🥇
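As a concrete (hypothetical) illustration of the temporally predictive framing: the design matrix would pair the covariates from one session with the outcome from the following session, which in pandas is a grouped shift. All column names here are invented for illustration:

```python
import pandas as pd

# Toy per-session data: one row per (subject, visit).
sessions = pd.DataFrame({
    "subject":   ["a", "a", "a", "b", "b"],
    "t":         [0.0, 1.0, 2.0, 0.0, 1.5],
    "motor":     [12, 15, 20, 8, 9],
    "apathetic": [0, 0, 1, 0, 0],
})

# Within each subject, attach the *next* session's apathy status to the
# current session's covariates (shift(-1) pulls the next row's outcome back).
sessions["apathetic_next"] = sessions.groupby("subject")["apathetic"].shift(-1)
design = sessions.dropna(subset=["apathetic_next"])
# 'motor' at time t is now a predictor of apathy at the following visit,
# i.e. a row-wise version of p(apathetic | t + ∆t, motor_scores(t), ...).
```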
Currently, the proposal is to look at apathy status over time via a logistic regression (where 'over time' means that temporally varying measurement-level variables are included in the model, and the prediction is per session), with model comparison via LOOIC.
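A rough sketch of that per-session setup, with a non-Bayesian stand-in for the fit (a LOOIC comparison would need a Bayesian model, e.g. PyMC + ArviZ). Variable names and the simulated effects are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
t = rng.uniform(0, 10, n)                  # years since diagnosis (per session)
baseline_cog = rng.normal(0, 1, n)         # subject-level baseline cognition (z-scored)
motor = 10 + 2 * t + rng.normal(0, 3, n)   # time-varying motor score

# Simulate per-session apathy status: worsens with t and motor decline.
logit = -4.0 + 0.3 * t + 0.1 * (motor - 10) - 0.5 * baseline_cog
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Measurement-level covariates plus an explicit baseline-by-t interaction.
X = np.column_stack([t, motor, baseline_cog, baseline_cog * t])
model = LogisticRegression(max_iter=1000).fit(X, y)
p_per_session = model.predict_proba(X)[:, 1]   # per-session p(apathetic)
```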
Two other approaches we could take are:

1. Modelling the probability of developing apathy within a window of `∆t` years.
2. A more flexible classification approach.

For the former, the simpler model is (arguably) a bit more intuitive: predicting what happens to a patient given their current status and age / cognitive score / etc. However, it also feels like the models are closely related: for a given time since diagnosis (`t`) our model would give `p(apathetic | t, ...)`, and we could get something like Kyla's metric via e.g.

```
p(developing apathy | ∆t, ...) = (p(apathetic | t + ∆t, ...) - p(apathetic | t, ...)) / p(!apathetic | t, ...)
```

(This is an oversimplification: it assumes, for the sake of argument, a positive `β` on `t`, and that other metrics get worse over time, so that remission is negligible.) In other words, the 'development' model can be thought of as a (normalised) slice through the full model at a specific timepoint. In that context, we would then talk about e.g. the interactions between subject-level variables and years since diagnosis as being the 'risk factors' for developing apathy.

One caveat is the interpretation of `p(apathetic | t + ∆t, ...)` in the presence of other measurement-level variables: it is really something more like `p(apathetic | t + ∆t, motor_scores(t + ∆t), ...)`. That then has a non-trivial dependence on changes in other metrics, in a way that means it's not such a pure predictor (we could presumably marginalise over future unseen motor scores etc., but that introduces more complexity).

For the classification approach, we're basically trading off interpretability for flexibility (if we went for, say, a GP / kernel regression / kernel SVM / etc. approach). Are there any obvious disadvantages, or is it redundant to have a look at that approach? (This would be more of a potential side project, and wouldn't change the core analysis.)
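For what the classification route could look like, here is a hedged sketch using an RBF-kernel SVM on per-session features: it trades the interpretable coefficients of the logistic regression for a nonlinear decision boundary. The features and simulated labels are invented for illustration:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 3))   # e.g. columns for [t, motor, baseline_cog]
# Nonlinear ground truth that a linear model would struggle to capture.
y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=150) > 0.5).astype(int)

# Scale features, then fit an RBF-kernel SVM with probability estimates.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
clf.fit(X, y)
probs = clf.predict_proba(X)[:, 1]   # per-session p(apathetic)
```

A GP classifier (e.g. `sklearn.gaussian_process.GaussianProcessClassifier`) would slot into the same pipeline if calibrated uncertainties mattered more.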
Thanks!!
@cleheron @zenourn @m-macaskill