actigraph / pygt3x

Python module for reading GT3X/AGDC file format data
GNU General Public License v3.0

Timestamp has no milliseconds #38

Closed by muschellij2 8 months ago

muschellij2 commented 9 months ago

Similar to #37: pygt3x can read the data, but the timestamps have no milliseconds.

wget https://ndownloader.figshare.com/files/21855567 -O AI3_CLE2B21130054_2017-06-02.gt3x.gz
gunzip AI3_CLE2B21130054_2017-06-02.gt3x.gz

from pygt3x.reader import FileReader

with FileReader("AI3_CLE2B21130054_2017-06-02.gt3x") as reader:
    df = reader.to_pandas()
    print(df.head(5))
#                      X         Y         Z
# Timestamp                                 
# 1.496405e+09 -0.739003  0.624633 -0.328446
# 1.496405e+09 -0.478006  0.785924 -0.175953
# 1.496405e+09 -0.126100  0.686217 -0.219941
# 1.496405e+09  0.633431  0.595308 -0.706745
# 1.496405e+09  0.803519  0.551320 -0.809384

The index values have no subseconds:

df.index.values    
#  array([1.4964048e+09, 1.4964048e+09, 1.4964048e+09, ..., 1.4970096e+09,
#       1.4970096e+09, 1.4970096e+09])
df.index.values % 1.0
# array([0., 0., 0., ..., 0., 0., 0.])
(df.index.values % 1.0) * 100
# array([0., 0., 0., ..., 0., 0., 0.])

The first three index values are identical:

df.index.values[0]
# 1496404800.0
df.index.values[1]
# 1496404800.0
df.index.values[2]
# 1496404800.0
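
Until the reader returns subsecond timestamps, one possible workaround is to spread the samples that share a whole-second timestamp evenly across that second. The sketch below is not part of pygt3x; `add_subseconds` is a hypothetical helper, the input is a synthetic stand-in for `df.index.values`, and it assumes every second starts at offset 0 and that the sampling rate (30 Hz here) is known.

```python
import numpy as np
import pandas as pd


def add_subseconds(index_seconds, sample_rate):
    """Hypothetical helper: offset each sample by its position within its second."""
    seconds = pd.Series(np.asarray(index_seconds, dtype="float64"))
    # Position of each sample among the samples sharing its timestamp: 0, 1, 2, ...
    position = seconds.groupby(seconds).cumcount()
    return (seconds + position / sample_rate).to_numpy()


# Synthetic stand-in for df.index.values: two seconds sampled at 30 Hz.
whole_seconds = np.repeat([1496404800.0, 1496404801.0], 30)
fixed = add_subseconds(whole_seconds, sample_rate=30)

# Offsets of the first three samples relative to the first second.
print(fixed[:3] - 1496404800.0)
```

If a second holds fewer samples than the sampling rate (for example next to a gap), this assumption places them at the start of the second, which may not match the device's actual timing.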

R code reproduction

library(curl)
#> Using libcurl 7.79.1 with LibreSSL/3.3.6
library(agcounts)
library(R.utils)
#> Loading required package: R.oo
#> Loading required package: R.methodsS3
#> R.methodsS3 v1.8.2 (2022-06-13 22:00:14 UTC) successfully loaded. See ?R.methodsS3 for help.
#> R.oo v1.25.0 (2022-06-12 02:20:02 UTC) successfully loaded. See ?R.oo for help.
#> 
#> Attaching package: 'R.oo'
#> The following object is masked from 'package:R.methodsS3':
#> 
#>     throw
#> The following objects are masked from 'package:methods':
#> 
#>     getClasses, getMethods
#> The following objects are masked from 'package:base':
#> 
#>     attach, detach, load, save
#> R.utils v2.12.2 (2022-11-11 22:00:03 UTC) successfully loaded. See ?R.utils for help.
#> 
#> Attaching package: 'R.utils'
#> The following object is masked from 'package:utils':
#> 
#>     timestamp
#> The following objects are masked from 'package:base':
#> 
#>     cat, commandArgs, getOption, isOpen, nullfile, parse, warnings
options(digits.secs = 3)
url = "https://ndownloader.figshare.com/files/21855567"
# AI3_CLE2B21130054_2017-06-02.gt3x.gz
destfile = tempfile(fileext = ".gt3x.gz")
curl::curl_download(url, destfile)
path = R.utils::gunzip(destfile, remove = FALSE)
y = agcounts::agread(path = path, parser = "pygt3x")
y = tibble::as_tibble(y)
head(y)
#> # A tibble: 6 × 4
#>   time                         X     Y      Z
#>   <dttm>                   <dbl> <dbl>  <dbl>
#> 1 2017-06-02 12:00:00.000 -0.739 0.625 -0.328
#> 2 2017-06-02 12:00:00.000 -0.478 0.786 -0.176
#> 3 2017-06-02 12:00:00.000 -0.126 0.686 -0.220
#> 4 2017-06-02 12:00:00.000  0.633 0.595 -0.707
#> 5 2017-06-02 12:00:00.000  0.804 0.551 -0.809
#> 6 2017-06-02 12:00:00.000  0.806 0.677 -0.944
head(y$time)
#> [1] "2017-06-02 12:00:00 UTC" "2017-06-02 12:00:00 UTC"
#> [3] "2017-06-02 12:00:00 UTC" "2017-06-02 12:00:00 UTC"
#> [5] "2017-06-02 12:00:00 UTC" "2017-06-02 12:00:00 UTC"

The output above shows that no subseconds are read in when pygt3x is used through R.

The timestamp advances by one second every 30 samples (the sampling rate), and no timestamps differ within a second:

ind = which(y$X == 0)
d = diff(y$time)
head(which(d > 0))
#> [1]  30  60  90 120 150 180
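
The same check can be sketched directly in Python with NumPy. The array below is a synthetic stand-in for `df.index.values` (seven seconds of whole-second timestamps at an assumed 30 Hz), not the real file; with the real data, `timestamps = df.index.values` would be the assumed substitution.

```python
import numpy as np

sample_rate = 30  # Hz, taken from the discussion above

# Synthetic stand-in for df.index.values: 7 seconds of whole-second timestamps.
timestamps = np.repeat(1496404800.0 + np.arange(7), sample_rate)

# 1-based positions where the timestamp changes, mirroring which(diff(y$time) > 0) in R.
steps = np.flatnonzero(np.diff(timestamps) > 0) + 1
print(steps[:6])  # [ 30  60  90 120 150 180]

# Number of samples sharing each distinct timestamp.
values, counts = np.unique(timestamps, return_counts=True)
print(counts)
```

Every distinct timestamp being shared by exactly `sample_rate` samples is the signature of truncated subseconds.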

The read.gt3x parser, by contrast, returns subsecond timestamps:

y = agcounts::agread(path = path, parser = "read.gt3x")
y = tibble::as_tibble(y)
head(y)
#> # A tibble: 6 × 4
#>   time                         X     Y      Z
#>   <dttm>                   <dbl> <dbl>  <dbl>
#> 1 2017-06-02 12:00:00.000 -0.739 0.625 -0.328
#> 2 2017-06-02 12:00:00.033 -0.478 0.786 -0.176
#> 3 2017-06-02 12:00:00.066 -0.126 0.686 -0.22 
#> 4 2017-06-02 12:00:00.099  0.633 0.595 -0.707
#> 5 2017-06-02 12:00:00.133  0.804 0.551 -0.809
#> 6 2017-06-02 12:00:00.166  0.806 0.677 -0.944

Created on 2023-12-29 with reprex v2.0.2

Session info

sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16)
#>  os       macOS Monterey 12.6
#>  system   x86_64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2023-12-29
#>  pandoc   3.1.5 @ /usr/local/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  agcounts    * 0.6.4   2023-10-25 [1] Github (bhelsel/agcounts@f1cce52)
#>  bit           4.0.5   2022-11-15 [1] CRAN (R 4.3.0)
#>  bit64         4.0.5   2020-08-30 [1] CRAN (R 4.3.0)
#>  blob          1.2.4   2023-03-17 [1] CRAN (R 4.3.0)
#>  bslib         0.5.1   2023-08-11 [1] CRAN (R 4.3.0)
#>  cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.0)
#>  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.0)
#>  colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.3.0)
#>  curl        * 5.1.0   2023-10-02 [1] CRAN (R 4.3.0)
#>  data.table    1.14.8  2023-02-17 [1] CRAN (R 4.3.0)
#>  DBI           1.1.3   2022-06-18 [1] CRAN (R 4.3.0)
#>  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.0)
#>  dplyr         1.1.3   2023-09-03 [1] CRAN (R 4.3.0)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.3.0)
#>  evaluate      0.22    2023-09-29 [1] CRAN (R 4.3.0)
#>  fansi         1.0.5   2023-10-08 [1] CRAN (R 4.3.0)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.0)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
#>  GGIR          3.0-0   2023-10-16 [1] CRAN (R 4.3.0)
#>  ggplot2       3.4.4   2023-10-12 [1] CRAN (R 4.3.0)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.0)
#>  gsignal       0.3-5   2022-05-15 [1] CRAN (R 4.3.0)
#>  gtable        0.3.4   2023-08-21 [1] CRAN (R 4.3.0)
#>  htmltools     0.5.6.1 2023-10-06 [1] CRAN (R 4.3.0)
#>  htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.3.0)
#>  httpuv        1.6.12  2023-10-23 [1] CRAN (R 4.3.0)
#>  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.3.0)
#>  jsonlite      1.8.7   2023-06-29 [1] CRAN (R 4.3.0)
#>  knitr         1.44    2023-09-11 [1] CRAN (R 4.3.0)
#>  later         1.3.1   2023-05-02 [1] CRAN (R 4.3.0)
#>  lattice       0.22-5  2023-10-24 [1] CRAN (R 4.3.0)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.3.0)
#>  lubridate     1.9.3   2023-09-27 [1] CRAN (R 4.3.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
#>  Matrix        1.6-1.1 2023-09-18 [1] CRAN (R 4.3.0)
#>  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.3.0)
#>  mime          0.12    2021-09-28 [1] CRAN (R 4.3.0)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.3.0)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
#>  png           0.1-8   2022-11-29 [1] CRAN (R 4.3.0)
#>  promises      1.2.1   2023-08-10 [1] CRAN (R 4.3.0)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3 * 1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo        * 1.25.0  2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils     * 2.12.2  2022-11-11 [1] CRAN (R 4.3.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
#>  Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.3.0)
#>  reactable     0.4.4   2023-03-12 [1] CRAN (R 4.3.0)
#>  read.gt3x     1.2.0   2023-10-25 [1] Github (THLfi/read.gt3x@a41037a)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.3.0)
#>  reticulate    1.34.0  2023-10-12 [1] CRAN (R 4.3.0)
#>  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.3.0)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.0)
#>  RSQLite       2.3.1   2023-04-03 [1] CRAN (R 4.3.0)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.0)
#>  sass          0.4.7   2023-07-15 [1] CRAN (R 4.3.0)
#>  scales        1.2.1   2022-08-20 [1] CRAN (R 4.3.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
#>  shiny         1.7.5.1 2023-10-14 [1] CRAN (R 4.3.0)
#>  stringi       1.7.12  2023-01-11 [1] CRAN (R 4.3.0)
#>  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.3.0)
#>  styler        1.10.2  2023-08-29 [1] CRAN (R 4.3.0)
#>  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
#>  timechange    0.2.0   2023-01-11 [1] CRAN (R 4.3.0)
#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.0)
#>  vctrs         0.6.4   2023-10-12 [1] CRAN (R 4.3.0)
#>  withr         2.5.1   2023-09-26 [1] CRAN (R 4.3.0)
#>  xfun          0.40    2023-08-09 [1] CRAN (R 4.3.0)
#>  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.0)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.0)
#>  zoo           1.8-12  2023-04-13 [1] CRAN (R 4.3.0)
#>
#>  [1] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library
#>
#> ─ Python configuration ───────────────────────────────────────────────────────
#>  python:         /Users/johnmuschelli/miniconda3/bin/python3
#>  libpython:      /Users/johnmuschelli/miniconda3/lib/libpython3.11.dylib
#>  pythonhome:     /Users/johnmuschelli/miniconda3:/Users/johnmuschelli/miniconda3
#>  version:        3.11.4 (main, Jul 5 2023, 08:41:25) [Clang 14.0.6 ]
#>  numpy:          /Users/johnmuschelli/miniconda3/lib/python3.11/site-packages/numpy
#>  numpy_version:  1.25.2
#>  pygt3x:         /Users/johnmuschelli/miniconda3/lib/python3.11/site-packages/pygt3x
#>
#>  NOTE: Python version was forced by RETICULATE_PYTHON
#>
#> ──────────────────────────────────────────────────────────────────────────────

muschellij2 commented 9 months ago

I can confirm that gaps are not imputed with zeros when the file is read in (see the jump from 12:46:28 to 12:46:36 below). How does this affect agcounts and its temporal filtering?

y[83669:83673,]
# A tibble: 5 × 4
#   time                        X      Y      Z
#   <dttm>                  <dbl>  <dbl>  <dbl>
# 1 2017-06-02 12:46:28.000 0.610 -0.739 -0.334
# 2 2017-06-02 12:46:28.000 0.604 -0.742 -0.331
# 3 2017-06-02 12:46:36.000 0.587 -0.768 -0.906
# 4 2017-06-02 12:46:36.000 0.689 -0.733 -0.894
# 5 2017-06-02 12:46:36.000 0.944 -0.657 -0.894
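
A jump like the one above can be located programmatically, and the missing seconds can be filled if a downstream filter needs evenly spaced samples. The following is only a sketch on synthetic data: the frame mimics the 28 to 36 second jump, the 30 Hz rate is assumed, and zero is just one possible fill value. Whether zero-filling matches what agcounts expects is exactly the open question.

```python
import numpy as np
import pandas as pd

sample_rate = 30  # Hz, assumed from the discussion above

# Synthetic stand-in: whole-second timestamps with seconds 29..35 missing.
present = np.array([26, 27, 28, 36, 37], dtype="int64")
index = pd.Index(np.repeat(present, sample_rate), name="Timestamp")
df = pd.DataFrame({"X": 0.5, "Y": -0.7, "Z": -0.3}, index=index)

# Locate gaps: consecutive samples whose timestamps differ by more than one second.
ts = df.index.to_numpy()
jumps = np.flatnonzero(np.diff(ts) > 1)
gaps = [(int(ts[i]), int(ts[i + 1])) for i in jumps]
print(gaps)  # [(28, 36)]

# Build zero-valued rows for every missing second (an assumed imputation strategy).
full_seconds = np.arange(ts[0], ts[-1] + 1)
missing = np.setdiff1d(full_seconds, ts)
filler = pd.DataFrame(
    0.0,
    index=pd.Index(np.repeat(missing, sample_rate), name="Timestamp"),
    columns=df.columns,
)

# Stable sort keeps the original sample order within each second.
filled = pd.concat([df, filler]).sort_index(kind="stable")
print(len(df), len(filled))
```

After filling, every second between the first and last timestamp holds `sample_rate` rows, so a fixed-rate filter sees a continuous signal.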