paulhibbing / AGread

Read Accelerometer Files from ActiGraph Accelerometers

Reading Old GT3X Format (NHANES) #10

Open muschellij2 opened 4 years ago

muschellij2 commented 4 years ago

GT3X+ (01 day)_data.zip

This issue concerns files in the previous GT3X format (https://github.com/actigraph/NHANES-GT3X-File-Format), for example the data from https://help.theactigraph.com/entries/21688392-GT3X-ActiSleep-Sample-Data, or the attached GT3X+ (01 day)_data.zip.

Using file from GitHub

This file is in the older format described at https://github.com/actigraph/NHANES-GT3X-File-Format; it was uploaded to GitHub for this issue.

Download data

``` r
library(AGread)
#> package 'AGread' was built under R version 3.5.0
url = "https://github.com/paulhibbing/AGread/files/3653952/GT3X%2B.01.day._data.zip"
gt3x_file = tempfile(fileext = ".gt3x")
# mode = "wb" keeps the binary archive intact on Windows
dl = download.file(url, destfile = gt3x_file, mode = "wb")
```

Read in the Data

``` r
AGread::read_gt3x(file = gt3x_file)
#> Error in AGread::read_gt3x(file = gt3x_file): all(c("info.txt", "log.bin") %in% file_3x$Name) is not TRUE
```

Read in using read.gt3x

We patched read.gt3x to support the older format, so the same file reads successfully there.

``` r
library(read.gt3x)
res = read.gt3x::read.gt3x(gt3x_file, verbose = TRUE)
#> Input is a .gt3x file, unzipping to a temporary location first...
#> Unzipping gt3x data to /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmploYK6b
#> 1/1
#> Unzipping /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmploYK6b/file93c5326b0e37.gt3x
#>  === info.txt, activity.bin, lux.bin extracted to /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmploYK6b/file93c5326b0e37
#> GT3X information
#>  $ Serial Number     :"NEO1DXXXXXXXX"
#>  $ Device Type       :"GT3XPlus"
#>  $ Firmware          :"2.5.0"
#>  $ Battery Voltage   :"4.22"
#>  $ Sample Rate       :30
#>  $ Start Date        : POSIXct, format: "2012-06-27 10:54:00"
#>  $ Stop Date         : POSIXct, format: "2012-06-28 11:54:00"
#>  $ Download Date     : POSIXct, format: "2012-06-28 16:25:52"
#>  $ Board Revision    :"4"
#>  $ Unexpected Resets :"0"
#>  $ Sex               :"Male"
#>  $ Height            :"172.72"
#>  $ Mass              :"69.8532249799612"
#>  $ Age               :"43"
#>  $ Race              :"White / Caucasian"
#>  $ Limb              :"Ankle"
#>  $ Side              :"Left"
#>  $ Dominance         :"Non-Dominant"
#>  $ DateOfBirth       :"621132192000000000"
#>  $ Subject Name      :"GT3XPlus"
#>  $ Serial Prefix     :"NEO"
#>  $ Last Sample Time  : 'POSIXct' num(0) 
#>  - attr(*, "tzone")= chr "GMT"
#>  $ Acceleration Scale:341
#> Parsing GT3X data via CPP.. expected sample size: 2700000
#> Using NHANES-GT3X format - older format
#> Sample size: 2700000
#> Scaling...
#> Lux Sample size: 2700000
#> Done (in 0.576321125030518 seconds)
```
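The header values printed above can be cross-checked with a little arithmetic. The sketch below recomputes the expected sample size from the sample rate and the start/stop times, and decodes the raw `DateOfBirth` string. It assumes that `DateOfBirth` is stored as .NET ticks (100-nanosecond units counted from 0001-01-01); that is an assumption based on how ActiGraph headers usually encode dates, not something the output itself states. `ticks_to_date` is a hypothetical helper, not part of either package.

``` r
# Values copied from the info.txt header printed by read.gt3x.
sample_rate <- 30
start_time <- as.POSIXct("2012-06-27 10:54:00", tz = "UTC")
stop_time <- as.POSIXct("2012-06-28 11:54:00", tz = "UTC")

# 25 hours of recording = 90000 seconds; 30 Hz * 90000 s = 2700000 samples,
# which is the expected sample size that read.gt3x reports.
duration_secs <- as.numeric(difftime(stop_time, start_time, units = "secs"))
expected_samples <- sample_rate * duration_secs
stopifnot(expected_samples == 2700000)

# Hypothetical helper. Assumption: the value is in .NET ticks.
ticks_to_date <- function(ticks) {
  unix_epoch_ticks <- 621355968000000000  # ticks at 1970-01-01 00:00:00 UTC
  secs <- round((as.numeric(ticks) - unix_epoch_ticks) / 1e7)
  as.Date(as.POSIXct(secs, origin = "1970-01-01", tz = "UTC"))
}

dob <- ticks_to_date("621132192000000000")
format(dob)
```

Under that assumption the decoded date falls in April 1969, which is consistent with the reported age of 43 at the 2012 start date.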

Would it be possible to make a similar patch for AGread?

Created on 2019-09-25 by the reprex package (v0.3.0)

Session info

``` r
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 3.6.0 (2019-04-26)
#>  os       macOS Mojave 10.14.6
#>  system   x86_64, darwin15.6.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2019-09-25
#>
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version    date       lib source
#>  AGread      * 1.1.0.9000 2019-09-25 [1] Github (paulhibbing/AGread@627b6d2)
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.6.0)
#>  backports     1.1.4      2019-04-10 [1] CRAN (R 3.6.0)
#>  callr         3.3.1      2019-07-18 [1] CRAN (R 3.6.0)
#>  cli           1.1.0      2019-03-19 [1] CRAN (R 3.6.0)
#>  colorspace    1.4-1      2019-03-18 [1] CRAN (R 3.6.0)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.6.0)
#>  desc          1.2.0      2019-07-10 [1] Github (muschellij2/desc@b0c374f)
#>  devtools      2.2.0.9000 2019-09-10 [1] Github (r-lib/devtools@d7f0915)
#>  digest        0.6.21     2019-09-20 [1] CRAN (R 3.6.0)
#>  dplyr         0.8.3      2019-07-04 [1] CRAN (R 3.6.0)
#>  DT            0.8        2019-08-07 [1] CRAN (R 3.6.0)
#>  ellipsis      0.3.0      2019-09-20 [1] CRAN (R 3.6.0)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 3.6.0)
#>  fs            1.3.1      2019-05-06 [1] CRAN (R 3.6.0)
#>  ggplot2       3.2.1      2019-08-10 [1] CRAN (R 3.6.0)
#>  glue          1.3.1      2019-03-12 [1] CRAN (R 3.6.0)
#>  gtable        0.3.0      2019-03-25 [1] CRAN (R 3.6.0)
#>  highr         0.8        2019-03-20 [1] CRAN (R 3.6.0)
#>  htmltools     0.3.6      2017-04-28 [1] CRAN (R 3.6.0)
#>  htmlwidgets   1.3        2018-09-30 [1] CRAN (R 3.6.0)
#>  knitr         1.24.3     2019-08-28 [1] Github (muschellij2/knitr@abcea3d)
#>  lazyeval      0.2.2      2019-03-15 [1] CRAN (R 3.6.0)
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.6.0)
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.6.0)
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 3.6.0)
#>  PAutilities   0.2.0      2019-07-10 [1] CRAN (R 3.6.0)
#>  pillar        1.4.2      2019-06-29 [1] CRAN (R 3.6.0)
#>  pkgbuild      1.0.5      2019-08-26 [1] CRAN (R 3.6.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 3.6.0)
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.6.0)
#>  prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.6.0)
#>  processx      3.4.1      2019-07-18 [1] CRAN (R 3.6.0)
#>  ps            1.3.0      2018-12-21 [1] CRAN (R 3.6.0)
#>  purrr         0.3.2      2019-03-15 [1] CRAN (R 3.6.0)
#>  R6            2.4.0      2019-02-14 [1] CRAN (R 3.6.0)
#>  Rcpp          1.0.2      2019-07-25 [1] CRAN (R 3.6.0)
#>  read.gt3x   * 0.1.0.9000 2019-09-19 [1] local
#>  remotes       2.1.0      2019-06-24 [1] CRAN (R 3.6.0)
#>  rlang         0.4.0      2019-06-25 [1] CRAN (R 3.6.0)
#>  rmarkdown     1.15       2019-08-21 [1] CRAN (R 3.6.0)
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.6.0)
#>  scales        1.0.0      2018-08-09 [1] CRAN (R 3.6.0)
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.6.0)
#>  stringi       1.4.3      2019-03-12 [1] CRAN (R 3.6.0)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 3.6.0)
#>  testthat      2.2.1      2019-07-25 [1] CRAN (R 3.6.0)
#>  tibble        2.1.3      2019-06-06 [1] CRAN (R 3.6.0)
#>  tidyselect    0.2.5      2018-10-11 [1] CRAN (R 3.6.0)
#>  usethis       1.5.1.9000 2019-08-15 [1] local
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.6.0)
#>  xfun          0.9        2019-08-21 [1] CRAN (R 3.6.0)
#>  yaml          2.2.0      2018-07-25 [1] CRAN (R 3.6.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
```

muschellij2 commented 4 years ago

Referencing the older issue: https://github.com/THLfi/read.gt3x/issues/3 and the PR: https://github.com/THLfi/read.gt3x/pull/2

paulhibbing commented 4 years ago

It would certainly be possible, and I think it's essential for AGread to have a solution for old gt3x files. That said, I don't know if implementing a new parser is the best approach. For one thing, my timeline for turning it around would be poor. More importantly, I'm not sure how much sense it makes to re-write something that's already available in read.gt3x. (Obviously there's already quite a lot of overlap between the two, but I don't think that can/should be undone at this point.)

I think the most expedient solution would be to modify AGread::read_gt3x to check if it's an old format, and then make a call to the read.gt3x patch if it is. That would be quick and easy, and I'm happy to add read.gt3x as a remote dependency until it's on CRAN.

I'm open to other suggestions too. Thanks for reaching out.
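The dispatch idea above could look roughly like the following. This is only a sketch under stated assumptions, not AGread's actual implementation: the helper names `looks_like_old_gt3x`, `is_old_gt3x`, and `read_gt3x_any` are hypothetical, and the format test simply mirrors what this thread shows (AGread expects `info.txt` and `log.bin`, while the old-format archive contains `info.txt`, `activity.bin`, and `lux.bin`).

``` r
# Hypothetical helper: decide from the archive's entry names alone.
# Assumption: old-format files have activity.bin and no log.bin.
looks_like_old_gt3x <- function(entry_names) {
  has_new <- all(c("info.txt", "log.bin") %in% entry_names)
  has_old <- all(c("info.txt", "activity.bin") %in% entry_names)
  !has_new && has_old
}

# A .gt3x file is a zip archive, so its contents can be listed without
# extracting anything.
is_old_gt3x <- function(file) {
  looks_like_old_gt3x(utils::unzip(file, list = TRUE)$Name)
}

# Hypothetical wrapper: hand old-format files to read.gt3x and everything
# else to AGread. The two readers return differently shaped objects, so a
# real implementation would also need to reconcile the outputs.
read_gt3x_any <- function(file) {
  if (is_old_gt3x(file)) {
    read.gt3x::read.gt3x(file)
  } else {
    AGread::read_gt3x(file)
  }
}

# Usage, with the file downloaded earlier in this issue:
# res <- read_gt3x_any(gt3x_file)
```

Keeping the name-based test separate from the file access makes the format check easy to exercise without a real device file.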

muschellij2 commented 4 years ago

I think that's the most pragmatic approach, although we did find some issues with read.gt3x. It is also much faster than read_gt3x for large files (I believe due to pre-allocation). I think a collaboration could be awesome, but I don't know that author personally.

Best, John

paulhibbing commented 4 years ago

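Nor do I... From what I can tell, read.gt3x looks to be written by someone who actually knows what they're doing in C++, which definitely helps. In that sense, I'm not surprised it's faster. read_gt3x is also slowed down by:

  • a fair number of tests/sanity checks in the background
  • the printing mechanism (if verbose = TRUE)
  • first leafing through the packets and categorizing them by type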

The last one allows you to skip parsing packets you don't care about, which could theoretically save you more time than it costs to categorize the packets. However, I think that probably only happens if you have both primary accelerometer and IMU data (from a GT9X device) and want to parse only one of those packet types.
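To illustrate the categorize-then-skip idea in that last point, here is a toy sketch. It does not reflect AGread's internals: the `packet_index` table, its columns, and `select_packets` are all hypothetical, and the type labels are used only as illustrative names.

``` r
# Hypothetical packet index, as a first pass over a file might produce it.
# Column names and type labels are illustrative only.
packet_index <- data.frame(
  type = c("ACTIVITY2", "SENSOR_DATA", "ACTIVITY2", "BATTERY", "SENSOR_DATA"),
  offset = c(0L, 190L, 1400L, 1590L, 1600L),
  stringsAsFactors = FALSE
)

# Keep only the packets whose type was requested; everything else is
# dropped before any payload parsing would take place.
select_packets <- function(index, include) {
  index[index$type %in% include, , drop = FALSE]
}

# Only the primary accelerometer packets survive the filter, so the IMU
# packets would never be parsed.
wanted <- select_packets(packet_index, include = "ACTIVITY2")
nrow(wanted)
```

The first pass costs time on every file, so it only pays off when the skipped packets would have been expensive to parse, as in the two-sensor case described above.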

muschellij2 commented 4 years ago

I think some of the speedups come from pre-allocation. I made some modifications to the code, removing push_back commands, and it seems to make the code faster I want to get_header command. I haven't opened a pull request yet because I haven't done a sufficient amount of testing.

-- Best, John
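As a rough illustration of why pre-allocation helps, here is the same contrast expressed in R rather than in the C++ code under discussion. It is an analogy only: `grow` and `prealloc` are hypothetical functions, and C++ `push_back` reallocates only occasionally, so the gap in the real parser is likely smaller than in this sketch, where every append copies the whole vector.

``` r
# Growing the result one element at a time: each c() call copies the
# entire vector built so far.
grow <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) out <- c(out, i * 0.5)
  out
}

# Pre-allocating: the result is sized once, then filled in place. This is
# possible whenever the final length is known up front, as it is when the
# header gives the expected sample size.
prealloc <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- i * 0.5
  out
}

n <- 1e4
same_result <- identical(grow(n), prealloc(n))

# Timings vary by machine, so none are quoted here.
system.time(grow(n))
system.time(prealloc(n))
```

Both functions return the same values; only the allocation pattern differs.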

paulhibbing commented 4 years ago

Sounds good. My recent push may create conflicts with what you've done already -- apologies if that's the case.