Motivation-and-Behaviour / sleepIPD_analysis

Analysis for the sleep and physical activity pooled study (https://osf.io/gzj9w/)

Testing for bidirectional relationships, confirming approach. Have we covered this? Do we need to perform SEM? #56

Closed conig closed 1 year ago

conig commented 1 year ago

We will test for bidirectional associations (Research Question 3) and nonlinear relationships when exploring the association between physical activity and sleep variables. This may include the use of generalised additive models, if appropriate.

We currently do this simply by using two sets of regressions:

Exercise -> sleep and previous_sleep -> exercise

Therefore, by answering research question 1 and research question 2, we answer research question 3.

We need to confirm we are going to lock this in. Note this does not use generalised additive models as is briefly mentioned in the protocol. Do we know who suggested GAMs?

GAMs look great for non-linearity but I can't see how they are going to help with bidirectionality. Example:

library(targets)
library(mgcv)
library(tidymv)

library(ggplot2)

tar_load(data_holdout)

model <- gam(sleep_duration ~ s(pa_volume, by = sex) + sex, data = data_holdout)

predict_gam(model) |> 
  ggplot(aes(pa_volume, fit)) +
  geom_point(data = data_holdout, aes(x = pa_volume, y = sleep_duration, colour = sex), alpha = 0.01) +
  geom_smooth_ci(sex) +
  theme_bw() + labs(y = "Sleep duration", x = "PA volume") +
  guides(colour = "none")

[Image: GAM-fitted sleep duration against PA volume by sex, overlaid on the holdout data]

conig commented 1 year ago

We need to discuss whether GAMs help us handle RQ3 beyond what we already have. It seems to me that they serve the same function as the polynomial equations.
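One way to check that impression is to fit a polynomial and a GAM to the same data and see how closely the fitted curves agree. This is only a sketch on simulated data: the quadratic shape, the noise level and the `sim` data frame are assumptions for illustration, not anything from the pooled dataset.

```r
library(mgcv)

# Simulated data only: an inverted-U relationship with made-up coefficients.
set.seed(42)
n <- 500
pa_volume <- runif(n, 0, 10)
sleep_duration <- 7 + 0.4 * pa_volume - 0.04 * pa_volume^2 + rnorm(n, sd = 0.5)
sim <- data.frame(pa_volume, sleep_duration)

# Quadratic polynomial regression versus a penalised smooth of the same predictor.
m_poly <- lm(sleep_duration ~ poly(pa_volume, 2), data = sim)
m_gam <- gam(sleep_duration ~ s(pa_volume), data = sim)

# If the two approaches capture the same curve, their fitted values
# should be almost perfectly correlated and their AICs should be close.
fit_cor <- cor(fitted(m_poly), fitted(m_gam))
print(fit_cor)
print(AIC(m_poly, m_gam))
```

Both models describe the shape of a one-directional association; neither says anything about the reverse direction, which is the point raised above.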

tarensanders commented 1 year ago

So I don't know why it says GAMs.

I think we are already testing for bi-directionality by specifying the model in both directions. Since it's not the same variables (because of the lag/lead thing) this isn't just reverse causality. If both models are significant, that's an indicator for bi-directionality. This is what one of our PhD students did, and there were no concerns in peer-review (which is a low bar, I know).
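A minimal sketch of that two-direction setup, on simulated data. The variable names (`pa`, `sleep`, `prev_sleep`), the effect sizes and the plain `lm()` calls are all assumptions for illustration; the real models would also need to account for repeated measures within participants.

```r
# Simulated data only: each day's PA depends on the previous night's sleep,
# and that night's sleep depends on the day's PA.
set.seed(1)
n_id <- 300
n_day <- 4

sim_person <- function(id) {
  pa <- sleep <- prev_sleep <- numeric(n_day)
  last_sleep <- rnorm(1)  # sleep on the night before the first measured day
  for (t in seq_len(n_day)) {
    prev_sleep[t] <- last_sleep
    pa[t] <- 0.3 * last_sleep + rnorm(1)  # previous sleep -> today's PA
    sleep[t] <- 0.3 * pa[t] + rnorm(1)    # today's PA -> tonight's sleep
    last_sleep <- sleep[t]
  }
  data.frame(id = id, day = seq_len(n_day),
             pa = pa, sleep = sleep, prev_sleep = prev_sleep)
}
d <- do.call(rbind, lapply(seq_len(n_id), sim_person))

# Direction 1: exercise -> sleep. Direction 2: previous sleep -> exercise.
m_pa_to_sleep <- lm(sleep ~ pa, data = d)
m_sleep_to_pa <- lm(pa ~ prev_sleep, data = d)

print(coef(summary(m_pa_to_sleep))["pa", ])
print(coef(summary(m_sleep_to_pa))["prev_sleep", ])
```

Because the predictor in the second model is the lagged sleep value rather than the same night's sleep, the two models are not simply mirror images of each other.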

When I mentioned this to Chris, he suggested SEM (because of course he did, he's a psych). It would look something like:

flowchart LR
    PA1 --> PA2 & Sleep1
    PA2 --> PA3 & Sleep2
    PA3 --> PA4 & Sleep3
    PA4 --> Sleep4
    Sleep1 --> PA2 & Sleep2
    Sleep2 --> PA3 & Sleep3
    Sleep3 --> PA4 & Sleep4

I don't really know if we need to do this though, since it's not what we said in the protocol.
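For reference, the arrows in the diagram could be written as one regression per node in lavaan-style model syntax. This is a rough sketch, not something we have run: the wave count follows the diagram, the fitting call is left commented out, and `data_wide` (one row per participant with columns PA1..PA4 and Sleep1..Sleep4) is a hypothetical object.

```r
n_wave <- 4  # the diagram shows four cycles

# Each variable is predicted by the variables with arrows pointing into it.
eqs <- c("Sleep1 ~ PA1")
for (t in 2:n_wave) {
  eqs <- c(eqs,
           sprintf("PA%d ~ PA%d + Sleep%d", t, t - 1, t - 1),
           sprintf("Sleep%d ~ PA%d + Sleep%d", t, t, t - 1))
}
model <- paste(eqs, collapse = "\n")
cat(model, "\n")

# Fitting would need the lavaan package and a wide-format data frame
# (hypothetical here), e.g.:
# fit <- lavaan::sem(model, data = data_wide)
# summary(fit)
```

Changing `n_wave` shows how quickly the number of equations grows with each extra cycle, which matters if many participants only have a few measurement days.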

conig commented 1 year ago

I think we can take this one back to the team once Tim organises. I agree that SEM makes sense for the aim. But we suggested this right at the beginning, and I remember some were concerned that the journals we are targeting may not be familiar with SEM.

conig commented 1 year ago

If we were going to do this, we'd only be able to include a few cycles, as half the sample has only three recorded measurement days:

> data_clean$measurement_day |> gsub(".*_","", x = _ ) |>  as.numeric() |> quantile(na.rm = TRUE)
  0%  25%  50%  75% 100% 
   0    1    3    6  116 

tarensanders commented 1 year ago

Let me double check those very high measurement day values. Those are suspect.

tarensanders commented 1 year ago

There are some cases where a row has no PA data and no sleep data. That seems crazy. Just waiting on the new imputations to finish, then I'll push a fix. It might also smooth out those remaining bumps in the imputations.

tarensanders commented 1 year ago

@conig satisfied that we can close this? We can let the reviewers decide if we need a different approach.

conig commented 1 year ago

Yes I agree, closed.