**Open** · **sharrison5** opened this issue 3 years ago
Excellent thoughts. The current model is focused more on understanding what is happening, and is not geared towards actual predictive application.

On `p(PDD within 4 years | age, z, hallucinations, updrs_III)`: I'm not actually a fan of this approach when it comes to global z, as some people have a high probability of converting within a year while others are more likely to convert 2–3 years later. The way you have described it is how I think things should be done: have `p(apathy | t, ...)` and then determine the probability of developing apathy in any given window.

I would include `baseline_motor`, `baseline_cognition`, etc., all interacting with `t`. This is how any such model would be applied in practice: when predicting future times, all you know are the baseline values. All the longitudinal data is still included in the model due to the point above.

For now we are primarily interested in interpretability. If there were a benefit to being able to best predict which individuals will become apathetic (e.g. for clinical trials, effective treatments), then a black-box approach could definitely have utility.
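The "probability of developing apathy in any given window" idea can be sketched as a thin wrapper around whatever fitted model provides `p(apathy | t, ...)`. This is a hypothetical illustration (function and model names are invented), assuming remission is negligible so that anyone apathetic at `t + dt` but not at `t` converted within the window:

```python
import math

def p_develop_in_window(p_apathetic, t, dt, **covariates):
    """p_apathetic: callable (t, **covariates) -> p(apathetic | t, covariates)."""
    p_now = p_apathetic(t, **covariates)
    p_later = p_apathetic(t + dt, **covariates)
    # Conditional on not being apathetic at t (remission assumed negligible):
    return (p_later - p_now) / (1.0 - p_now)

def toy_model(t, beta_t=0.3, intercept=-2.0):
    # Toy stand-in for a fitted model: log-odds linear in years since diagnosis.
    return 1.0 / (1.0 + math.exp(-(intercept + beta_t * t)))
```

With the toy model, widening the window `dt` increases the conversion probability, as you'd expect.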
This is really useful, thank you!! 😄
For my benefit as much as anything else, I'll just quickly rehash one key distinction, as your comments have made a few things drop into place. There are two contexts in which I've been using the term *predictive* — associative versus temporally predictive — and they're really quite distinct: if the model is `p(apathetic | t + ∆t, motor_scores(t + ∆t), ...)` then we could say e.g. that lower motor scores are associated with a higher prevalence of apathy, but not that the motor scores are temporally predictive.

I think the outstanding question is whether we want to investigate a set of 'temporally predictive' variables, as well as / instead of the current associative model (I think this is still a good starting point). I'll have a think about what those terms could look like. The obvious ones would be e.g. `t` and baseline scores as you suggest (i.e. something like `p(apathetic | t + ∆t, motor_scores(t), ...)`).

Excellent! 🥇
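As a concrete (hypothetical) illustration of the temporally predictive framing: the design matrix would pair the covariates from one session with the outcome from the following session, which in pandas is a grouped shift. All column names here are invented for illustration:

```python
import pandas as pd

# Toy per-session data: one row per (subject, visit).
sessions = pd.DataFrame({
    "subject":   ["a", "a", "a", "b", "b"],
    "t":         [0.0, 1.0, 2.0, 0.0, 1.5],
    "motor":     [12, 15, 20, 8, 9],
    "apathetic": [0, 0, 1, 0, 0],
})

# Within each subject, attach the *next* session's apathy status to the
# current session's covariates (shift(-1) pulls the next row's outcome back).
sessions["apathetic_next"] = sessions.groupby("subject")["apathetic"].shift(-1)
design = sessions.dropna(subset=["apathetic_next"])
# 'motor' at time t is now a predictor of apathy at the following visit,
# i.e. a row-wise version of p(apathetic | t + ∆t, motor_scores(t), ...).
```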
Currently, the proposal is to look at apathy status over time via a logistic regression (where 'over time' means that temporally varying measurement-level variables are included in the model, and the prediction is per session), with model comparison via LOOIC.
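A rough sketch of that per-session setup, with a non-Bayesian stand-in for the fit (a LOOIC comparison would need a Bayesian model, e.g. PyMC + ArviZ). Variable names and the simulated effects are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
t = rng.uniform(0, 10, n)                  # years since diagnosis (per session)
baseline_cog = rng.normal(0, 1, n)         # subject-level baseline cognition (z-scored)
motor = 10 + 2 * t + rng.normal(0, 3, n)   # time-varying motor score

# Simulate per-session apathy status: worsens with t and motor decline.
logit = -4.0 + 0.3 * t + 0.1 * (motor - 10) - 0.5 * baseline_cog
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Measurement-level covariates plus an explicit baseline-by-t interaction.
X = np.column_stack([t, motor, baseline_cog, baseline_cog * t])
model = LogisticRegression(max_iter=1000).fit(X, y)
p_per_session = model.predict_proba(X)[:, 1]   # per-session p(apathetic)
```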
Two other approaches we could take are:

1. Modelling the probability of developing apathy within a window of `∆t` years.
2. A more flexible classification approach.

For the former, the simpler model is (arguably) a bit more intuitive: predicting what happens to a patient given their current status and age / cognitive score / etc. However, it also feels like the models are closely related: for a given time since diagnosis (`t`) our model would give `p(apathetic | t, ...)`, and we could get something like Kyla's metric via e.g.

```
p(developing apathy | ∆t, ...) = (p(apathetic | t + ∆t, ...) - p(apathetic | t, ...)) / p(!apathetic | t, ...)
```

(This is an oversimplification: it assumes, for the sake of argument, a positive `β` on `t`, and that other metrics get worse over time, so that remission is negligible.) In other words, the 'development' model can be thought of as a (normalised) slice through the full model at a specific timepoint. In that context, we would then talk about e.g. the interactions between subject-level variables and years since diagnosis as being the 'risk factors' for developing apathy.

One caveat is the interpretation of `p(apathetic | t + ∆t, ...)` in the presence of other measurement-level variables: it is really something more like `p(apathetic | t + ∆t, motor_scores(t + ∆t), ...)`. That then has a non-trivial dependence on changes in other metrics, in a way that means it's not such a pure predictor (we could presumably marginalise over future unseen motor scores etc., but that introduces more complexity).

For the classification approach, we're basically trading off interpretability for flexibility (if we went for, say, a GP / kernel regression / kernel SVM / etc. approach). Are there any obvious disadvantages, or is it redundant to have a look at that approach? (This would be more of a potential side project, and wouldn't change the core analysis.)
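For what the classification route could look like, here is a hedged sketch using an RBF-kernel SVM on per-session features: it trades the interpretable coefficients of the logistic regression for a nonlinear decision boundary. The features and simulated labels are invented for illustration:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 3))   # e.g. columns for [t, motor, baseline_cog]
# Nonlinear ground truth that a linear model would struggle to capture.
y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=150) > 0.5).astype(int)

# Scale features, then fit an RBF-kernel SVM with probability estimates.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
clf.fit(X, y)
probs = clf.predict_proba(X)[:, 1]   # per-session p(apathetic)
```

A GP classifier (e.g. `sklearn.gaussian_process.GaussianProcessClassifier`) would slot into the same pipeline if calibrated uncertainties mattered more.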
Thanks!!
@cleheron @zenourn @m-macaskill