wadpac / GGIR

Code corresponding to R package GGIR
https://wadpac.github.io/GGIR/
Apache License 2.0

Ignore last night in part 5 when doing WW analysis and recording ends after midnight but before 7am #1196

Closed vincentvanhees closed 3 days ago

vincentvanhees commented 2 months ago

In a project where recordings sometimes end shortly after midnight, GGIR still includes these nights in the part 5 output, because the part 4 sleep analysis was triggered for them and part 5 then uses its result.

I think these sleep estimates are not useful if we have timewindow = "WW".

So, I am now updating the function in g.part5.wakesleepwindows.R to skip the labelling of a night if: k > 1 && windowSizeHours < 19 && "WW" %in% timewindow
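A minimal sketch of that skip condition, for illustration only. It assumes k is the index of the night within the recording, windowSizeHours is the length of the current window in hours, and timewindow is the vector of requested window types. The helper name skip_night_label is hypothetical and not part of GGIR:

```r
# Hypothetical helper (not part of GGIR): decide whether the labelling
# of night k should be skipped, using the condition quoted above.
skip_night_label <- function(k, windowSizeHours, timewindow) {
  k > 1 && windowSizeHours < 19 && "WW" %in% timewindow
}

# Short final window in a WW analysis: the night is skipped.
skip_night_label(k = 8, windowSizeHours = 17.5, timewindow = c("MM", "WW"))  # TRUE
# Same window, but WW was not requested: nothing changes.
skip_night_label(k = 8, windowSizeHours = 17.5, timewindow = "MM")           # FALSE
# The first night is never skipped by this rule.
skip_night_label(k = 1, windowSizeHours = 17.5, timewindow = "WW")           # FALSE
```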

jhmigueles commented 1 month ago

I have looked at this and I would propose handling it in the clean report generation, so that the analysis is still conducted on the full recording as it is now. This has the advantage that we do not break consistency in the analysis and do not harm any other project, as I think it is difficult to foresee whether another project would need that last window analysed (e.g., studies using WW that are only interested in awake outcomes from part 5?).

I have implemented a quick fix in this branch. In function getValidDayIndices, within g.report.part5, there is now a new condition to only include WW and OO windows if they are at least 19 hours long. I think this has the advantage that the full report still contains all days, including invalid days such as in the scenario you describe, while the clean report can be used directly in analyses. But I still have some concerns about this:

vincentvanhees commented 1 month ago

In the draft branch I created, the focus is on the time difference between the final noon and the end of the recording, which aims to check whether the recording ends at or after 7am. I now realise this can be simplified to just checking the final timestamp.

Your branch does not do this. It looks at the duration of the WW or OO window, which is not a good indicator of whether the final night is useful. For example, if a person wakes up at 6am on the second-to-last day, then having 19 hours of data would still mean that the final night ends at 1am and is not useful.
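To make the difference between the two checks concrete, here is a small sketch of the example above. The dates, the UTC timezone, and the helper name ends_at_or_after_7am are assumptions made for illustration and are not taken from either branch:

```r
# Hypothetical helper (not part of GGIR): TRUE when the recording ends
# at or after 7am clock time, judged from the final timestamp alone.
ends_at_or_after_7am <- function(final_timestamp) {
  as.integer(format(final_timestamp, "%H")) >= 7
}

# Assumed example: the person wakes at 06:00 on the second-to-last day
# and the recording holds exactly 19 hours of data after that.
wake <- as.POSIXct("2024-03-10 06:00:00", tz = "UTC")
recording_end <- wake + 19 * 3600

# Duration-based rule: the window is 19 hours long, so it would be kept.
window_hours <- as.numeric(difftime(recording_end, wake, units = "hours"))
window_hours >= 19

# Timestamp-based rule: the recording ends at 01:00, so it would be dropped.
format(recording_end, "%H:%M")
ends_at_or_after_7am(recording_end)
```

The duration rule accepts this window while the timestamp rule rejects it, which is the mismatch described above.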

New plan:

If I do this in the time series output from g.part5.savetimeseries and in the cleaned output from g.report.part5, then it will be clear from all reports (csv and pdf) that those final windows are not considered in the cleaned output.

My proposal is to not make these thresholds modifiable. I am afraid that there are researchers who start playing with parameters to boost the number of days, just like some researchers were trying to include MM windows with less than 23 hours of data.

Update: In PR #1202 I have now implemented this by adding a new parameter that triggers the omission of the last window if the last night is incomplete (default = FALSE). In this way it should not disturb existing analyses, but it gives us the freedom to account for incomplete last nights.

vincentvanhees commented 1 month ago

Re-opening this issue because the current approach is not intuitive when the last night of the recording is already ignored in part 4. In that case require_complete_lastnight_part5 = TRUE will ignore whatever the last window is, even though that window may end more than a day before the end of the recording. The exclusion of the last window should in that case be skipped.
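A rough sketch of the behaviour requested here, not the actual GGIR implementation. Apart from require_complete_lastnight_part5, every name below (keep_windows, windows, last_night_complete, last_night_ignored_in_part4) is hypothetical:

```r
# Hypothetical helper (not part of GGIR). Returns the window numbers to keep.
#   windows: part 5 window numbers in chronological order
#   last_night_complete: TRUE if the recording covers the final night
#   last_night_ignored_in_part4: TRUE if part 4 already discarded that night
keep_windows <- function(windows, require_complete_lastnight_part5,
                         last_night_complete, last_night_ignored_in_part4) {
  # Only drop the last window when it really belongs to the incomplete
  # final night, i.e. when part 4 has not already removed that night.
  drop_last <- require_complete_lastnight_part5 &&
    !last_night_complete &&
    !last_night_ignored_in_part4
  if (drop_last && length(windows) > 0) {
    windows <- windows[-length(windows)]
  }
  windows
}

# Incomplete last night, still present after part 4: last window is dropped.
keep_windows(1:6, TRUE, last_night_complete = FALSE,
             last_night_ignored_in_part4 = FALSE)
# Incomplete last night already ignored in part 4: all windows are kept.
keep_windows(1:6, TRUE, last_night_complete = FALSE,
             last_night_ignored_in_part4 = TRUE)
```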