dipetkov / actigraph.sleepr

Detect periods of sleep and non-wear from ActiGraph data

Choi algorithm failing due to timestamps, but they are correct #12

Open muschellij2 opened 3 months ago

muschellij2 commented 3 months ago

Here is a reprex of an issue that seems to be a "bug" caused by the storage type of the timestamp column.

library(agcounts)
library(read.gt3x)
library(actigraph.sleepr)
library(curl)
#> Using libcurl 8.4.0 with LibreSSL/3.3.6
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

Downloading and reading in the data

library(tidyr)
gt3x_file = tempfile(fileext = ".gt3x")
# https://github.com/muschellij2/Wrist-Worn-Accelerometry-Processing-Pipeline
url = "https://figshare.com/ndownloader/files/47702005"
curl::curl_download(url = url, destfile = gt3x_file)

df = read.gt3x::read.gt3x(path = gt3x_file, 
                          asDataFrame = TRUE, 
                          imputeZeroes = TRUE)
df
#> Sampling Rate: 30Hz
#> Firmware Version: 1.9.2
#> Serial Number Prefix: MOS
#>                  time     X     Y      Z
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988

Here we are using last observation carried forward to match idle sleep mode

# impute idle sleep mode
sample_rate = attr(df, "sample_rate")
acceleration_max = as.numeric(attr(df, "acceleration_max"))
df = dplyr::as_tibble(df)
df = df %>% 
  # find where all zeroes/imputed zeroes
  mutate(all_zero = X == 0 & Y == 0 & Z == 0) %>% 
  # replace all 0 with NA so it can be filled  
  mutate(
    X = ifelse(all_zero, NA_real_, X),
    Y = ifelse(all_zero, NA_real_, Y),
    Z = ifelse(all_zero, NA_real_, Z)
  )
any(df$all_zero)
#> [1] TRUE

Filling in the data

df = df %>% 
  select(-all_zero) %>% 
  tidyr::fill(X, Y, Z, .direction = "down")
head(df)
#> # A tibble: 6 × 4
#>   time                    X     Y      Z
#>   <dttm>              <dbl> <dbl>  <dbl>
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.18  0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988

Getting activity counts

ac60 = df %>% 
  agcounts::calculate_counts(epoch = 60L)

Error

# needed for `actigraph.sleepr`
ac60 = ac60 %>%
  rename(timestamp = time)
choi_nonwear = actigraph.sleepr::apply_choi(ac60)
#> Error: Missing timestamps. Epochs should be evenly spaced from first(timestamp) to last(timestamp).

Digging a bit, we see that has_missing_epochs returns TRUE, but this seems to be a bug:

# error happens at actigraph.sleepr:::check_no_missing_timestamps
has_missing_epochs(ac60)
#> [1] TRUE

Taking the code from has_missing_epochs_ and running it shows that it fails on identical:

# fails at actigraph.sleepr:::has_missing_epochs_
epoch_len <- get_epoch_length(ac60)
epochs <- seq(first(ac60$timestamp), last(ac60$timestamp), 
              by = epoch_len)
identical(epochs, ac60$timestamp)
#> [1] FALSE

But if we use all.equal we see that this returns TRUE

all.equal(epochs, ac60$timestamp)
#> [1] TRUE

And if we try == equality we see a TRUE

all(epochs == ac60$timestamp)
#> [1] TRUE
attributes(epochs)
#> $class
#> [1] "POSIXct" "POSIXt" 
#> 
#> $tzone
#> [1] "UTC"
attributes(ac60$timestamp)
#> $class
#> [1] "POSIXct" "POSIXt" 
#> 
#> $tzone
#> [1] "UTC"

The attributes are the same, and after conversion to numeric the two vectors are identical:

identical(as.numeric(epochs), as.numeric(ac60$timestamp))
#> TRUE

So the issue is their storage type (integer vs. double), which seems relatively minor and a byproduct:

typeof(ac60$timestamp)
#> [1] "integer"
typeof(epochs)
#> [1] "double"
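
The behaviour above can be reproduced without the data set. The following is a minimal sketch that builds two small POSIXct vectors by hand (`ts_int` and `ts_dbl` are illustrative names, not objects from the reprex) holding the same instants, one stored as integer and one as double:

```r
# Same instants, same class and time zone, different storage types.
ts_int <- structure(c(0L, 60L, 120L),
                    class = c("POSIXct", "POSIXt"), tzone = "UTC")
ts_dbl <- structure(c(0, 60, 120),
                    class = c("POSIXct", "POSIXt"), tzone = "UTC")

typeof(ts_int)
#> [1] "integer"
typeof(ts_dbl)
#> [1] "double"

# identical() also compares the storage type, so it reports a difference ...
identical(ts_dbl, ts_int)
#> [1] FALSE

# ... while all.equal() (within a tolerance) and == compare the values.
isTRUE(all.equal(ts_dbl, ts_int))
#> [1] TRUE
all(ts_dbl == ts_int)
#> [1] TRUE
```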

This discrepancy potentially comes from epoch_len, but definitely comes from dplyr::first and dplyr::last:

typeof(ac60$timestamp[1])
#> [1] "integer"
typeof(first(ac60$timestamp))
#> [1] "double"
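
Given the findings above, one way such a check could be made insensitive to the storage type is to compare the timestamps as plain numbers. This is only a sketch: `has_missing_epochs_numeric` is a hypothetical helper, not the implementation in actigraph.sleepr, and it assumes a non-empty, sorted timestamp vector and an epoch length given in seconds.

```r
# Hypothetical helper (not part of actigraph.sleepr): checks for gaps
# after dropping the POSIXct class and storage type via as.numeric().
has_missing_epochs_numeric <- function(timestamp, epoch_len) {
  secs <- as.numeric(timestamp)
  # Expected grid of epochs from the first to the last timestamp.
  expected <- seq(secs[1], secs[length(secs)], by = epoch_len)
  # all.equal() returns a message (not TRUE) if lengths or values differ.
  !isTRUE(all.equal(expected, secs))
}

# Integer-backed timestamps, evenly spaced every 60 seconds.
ts_ok <- structure(c(0L, 60L, 120L),
                   class = c("POSIXct", "POSIXt"), tzone = "UTC")
# Same, but with the epoch at 120 seconds missing.
ts_gap <- structure(c(0L, 60L, 180L),
                    class = c("POSIXct", "POSIXt"), tzone = "UTC")

has_missing_epochs_numeric(ts_ok, epoch_len = 60)
#> [1] FALSE
has_missing_epochs_numeric(ts_gap, epoch_len = 60)
#> [1] TRUE
```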

Created on 2024-07-15 with reprex v2.1.0

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24)
#>  os       macOS Sonoma 14.4.1
#>  system   x86_64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2024-07-15
#>  pandoc   3.2 @ /usr/local/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package          * version  date (UTC) lib source
#>  actigraph.sleepr * 0.2.0    2023-07-31 [1] Github (dipetkov/actigraph.sleepr@e754679)
#>  agcounts         * 0.6.8    2024-06-04 [1] local
#>  assertthat         0.2.1    2019-03-21 [1] CRAN (R 4.4.0)
#>  bit                4.0.5    2022-11-15 [1] CRAN (R 4.4.0)
#>  bit64              4.0.5    2020-08-30 [1] CRAN (R 4.4.0)
#>  blob               1.2.4    2023-03-17 [1] CRAN (R 4.4.0)
#>  bslib              0.7.0    2024-03-29 [1] CRAN (R 4.4.0)
#>  cachem             1.1.0    2024-05-16 [1] CRAN (R 4.4.0)
#>  cli                3.6.2    2023-12-11 [1] CRAN (R 4.4.0)
#>  colorspace         2.1-0    2023-01-23 [1] CRAN (R 4.4.0)
#>  curl             * 5.2.1    2024-03-01 [1] CRAN (R 4.4.0)
#>  data.table         1.15.4   2024-03-30 [1] CRAN (R 4.4.0)
#>  DBI                1.2.2    2024-02-16 [1] CRAN (R 4.4.0)
#>  digest             0.6.35   2024-03-11 [1] CRAN (R 4.4.0)
#>  dplyr            * 1.1.4    2023-11-17 [1] CRAN (R 4.4.0)
#>  evaluate           0.23     2023-11-01 [1] CRAN (R 4.4.0)
#>  fansi              1.0.6    2023-12-08 [1] CRAN (R 4.4.0)
#>  fastmap            1.2.0    2024-05-15 [1] CRAN (R 4.4.0)
#>  fs                 1.6.4    2024-04-25 [1] CRAN (R 4.4.0)
#>  generics           0.1.3    2022-07-05 [1] CRAN (R 4.4.0)
#>  GGIR               3.1-0    2024-06-28 [1] local
#>  ggplot2            3.5.1    2024-04-23 [1] CRAN (R 4.4.0)
#>  glue               1.7.0    2024-01-09 [1] CRAN (R 4.4.0)
#>  gsignal            0.3-5    2022-05-15 [1] CRAN (R 4.4.0)
#>  gtable             0.3.5    2024-04-22 [1] CRAN (R 4.4.0)
#>  htmltools          0.5.8.1  2024-04-04 [1] CRAN (R 4.4.0)
#>  htmlwidgets        1.6.4    2023-12-06 [1] CRAN (R 4.4.0)
#>  httpuv             1.6.15   2024-03-26 [1] CRAN (R 4.4.0)
#>  jquerylib          0.1.4    2021-04-26 [1] CRAN (R 4.4.0)
#>  jsonlite           1.8.8    2023-12-04 [1] CRAN (R 4.4.0)
#>  knitr              1.46     2024-04-06 [1] CRAN (R 4.4.0)
#>  later              1.3.2    2023-12-06 [1] CRAN (R 4.4.0)
#>  lattice            0.22-6   2024-03-20 [1] CRAN (R 4.4.0)
#>  lifecycle          1.0.4    2023-11-07 [1] CRAN (R 4.4.0)
#>  lubridate          1.9.3    2023-09-27 [1] CRAN (R 4.4.0)
#>  magrittr           2.0.3    2022-03-30 [1] CRAN (R 4.4.0)
#>  Matrix             1.7-0    2024-03-22 [1] CRAN (R 4.4.0)
#>  memoise            2.0.1    2021-11-26 [1] CRAN (R 4.4.0)
#>  mime               0.12     2021-09-28 [1] CRAN (R 4.4.0)
#>  munsell            0.5.1    2024-04-01 [1] CRAN (R 4.4.0)
#>  pillar             1.9.0    2023-03-22 [1] CRAN (R 4.4.0)
#>  pkgconfig          2.0.3    2019-09-22 [1] CRAN (R 4.4.0)
#>  png                0.1-8    2022-11-29 [1] CRAN (R 4.4.0)
#>  pracma             2.4.4    2023-11-10 [1] CRAN (R 4.4.0)
#>  promises           1.3.0    2024-04-05 [1] CRAN (R 4.4.0)
#>  purrr              1.0.2    2023-08-10 [1] CRAN (R 4.4.0)
#>  R.cache            0.16.0   2022-07-21 [1] CRAN (R 4.4.0)
#>  R.methodsS3        1.8.2    2022-06-13 [1] CRAN (R 4.4.0)
#>  R.oo               1.26.0   2024-01-24 [1] CRAN (R 4.4.0)
#>  R.utils            2.12.3   2023-11-18 [1] CRAN (R 4.4.0)
#>  R6                 2.5.1    2021-08-19 [1] CRAN (R 4.4.0)
#>  Rcpp               1.0.12   2024-01-09 [1] CRAN (R 4.4.0)
#>  RcppRoll           0.3.0    2018-06-05 [1] CRAN (R 4.4.0)
#>  reactable          0.4.4    2023-03-12 [1] CRAN (R 4.4.0)
#>  read.gt3x        * 1.2.0    2024-07-11 [1] local
#>  reprex             2.1.0    2024-01-11 [1] CRAN (R 4.4.0)
#>  reticulate         1.37.0   2024-05-21 [1] CRAN (R 4.4.0)
#>  rlang              1.1.3    2024-01-10 [1] CRAN (R 4.4.0)
#>  rmarkdown          2.27     2024-05-17 [1] CRAN (R 4.4.0)
#>  RSQLite            2.3.6    2024-03-31 [1] CRAN (R 4.4.0)
#>  rstudioapi         0.16.0   2024-03-24 [1] CRAN (R 4.4.0)
#>  sass               0.4.9    2024-03-15 [1] CRAN (R 4.4.0)
#>  scales             1.3.0    2023-11-28 [1] CRAN (R 4.4.0)
#>  sessioninfo        1.2.2    2021-12-06 [1] CRAN (R 4.4.0)
#>  shiny              1.8.1.1  2024-04-02 [1] CRAN (R 4.4.0)
#>  stringi            1.8.4    2024-05-06 [1] CRAN (R 4.4.0)
#>  stringr            1.5.1    2023-11-14 [1] CRAN (R 4.4.0)
#>  styler             1.10.3   2024-04-07 [1] CRAN (R 4.4.0)
#>  tibble             3.2.1    2023-03-20 [1] CRAN (R 4.4.0)
#>  tidyr            * 1.3.1    2024-01-24 [1] CRAN (R 4.4.0)
#>  tidyselect         1.2.1    2024-03-11 [1] CRAN (R 4.4.0)
#>  timechange         0.3.0    2024-01-18 [1] CRAN (R 4.4.0)
#>  utf8               1.2.4    2023-10-22 [1] CRAN (R 4.4.0)
#>  vctrs              0.6.5    2023-12-01 [1] CRAN (R 4.4.0)
#>  withr              3.0.0    2024-01-16 [1] CRAN (R 4.4.0)
#>  xfun               0.44     2024-05-15 [1] CRAN (R 4.4.0)
#>  xtable             1.8-4    2019-04-21 [1] CRAN (R 4.4.0)
#>  yaml               2.3.8    2023-12-11 [1] CRAN (R 4.4.0)
#>  zoo                1.8-12   2023-04-13 [1] CRAN (R 4.4.0)
#>
#>  [1] /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/library
#>
#> ─ Python configuration ───────────────────────────────────────────────────────
#>  python:         /Users/johnmuschelli/miniconda3/bin/python3
#>  libpython:      /Users/johnmuschelli/miniconda3/lib/libpython3.11.dylib
#>  pythonhome:     /Users/johnmuschelli/miniconda3:/Users/johnmuschelli/miniconda3
#>  version:        3.11.4 (main, Jul 5 2023, 08:41:25) [Clang 14.0.6 ]
#>  numpy:          /Users/johnmuschelli/miniconda3/lib/python3.11/site-packages/numpy
#>  numpy_version:  1.25.2
#>  pygt3x:         /Users/johnmuschelli/miniconda3/lib/python3.11/site-packages/pygt3x
#>
#>  NOTE: Python version was forced by RETICULATE_PYTHON
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

muschellij2 commented 3 months ago

I've tracked the issue down to vctrs::vec_slice: https://github.com/r-lib/vctrs/issues/1781. Workaround: convert the timestamp column so that typeof(timestamp_column) is "double", then rerun.
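
A minimal sketch of that workaround, assuming the timestamp column is an integer-backed POSIXct vector. `to_double_time` is a hypothetical helper name used only for illustration, and the small `ts_int` vector stands in for the real column:

```r
# Hypothetical helper: rebuild a POSIXct vector on double storage,
# keeping its time zone attribute.
to_double_time <- function(x) {
  structure(as.numeric(x),
            class = c("POSIXct", "POSIXt"),
            tzone = attr(x, "tzone"))
}

# Stand-in for an integer-backed timestamp column.
ts_int <- structure(c(0L, 60L, 120L),
                    class = c("POSIXct", "POSIXt"), tzone = "UTC")

ts_fixed <- to_double_time(ts_int)
typeof(ts_fixed)
#> [1] "double"
all(ts_fixed == ts_int)
#> [1] TRUE
```

In the reprex above this would presumably be applied as `ac60$timestamp <- to_double_time(ac60$timestamp)` before calling `apply_choi()`; that step has not been run here.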