Speeding up readAxivity

wadpac / GGIRread

Functions for reading accelerometer data files

https://CRAN.R-project.org/package=GGIRread

Apache License 2.0

5 stars 3 forks source link

Speeding up readAxivity #28

Closed vincentvanhees closed 1 year ago

vincentvanhees commented 1 year ago

This PR speeds up function readAxivity for reading .cwa files (discussed in #14). The new speed-up is primarily gained by how the code searches for data chunks (batches) later on in the recording.

Processing time improvements for running GGIR 2.9.1 - part 1 on:

My own AX3 test file (284 MB):

OLD: 400 seconds
NEW: 102 seconds
Difference: -75%

My own AX6 test file (164 MB):

OLD: 229 seconds
NEW: 55 seconds
Difference: -76%

UK Biobank AX3 test file (265 MB):

OLD: 350 seconds
NEW: 92 seconds
Difference: -74%

@jhmigueles Would you mind testing this on some of your own data to confirm these findings?

Thanks @egpbos for the brainstorm session this morning, as you can see it helped a lot.

codecov-commenter commented 1 year ago

Codecov Report

Merging #28 (1062360) into main (fa62993) will increase coverage by 0.11%. The diff coverage is 75.00%.

:exclamation: Current head 1062360 differs from pull request most recent head 4b8f084. Consider uploading reports for the commit 4b8f084 to get more accurate results

:exclamation: Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##             main      #28      +/-   ##
==========================================
+ Coverage   87.29%   87.40%   +0.11%     
==========================================
  Files           9        9              
  Lines         897      905       +8     
==========================================
+ Hits          783      791       +8     
  Misses        114      114

Impacted Files	Coverage Δ
R/readAxivity.R	`85.41% <75.00%> (+0.41%)`	:arrow_up:

egpbos commented 1 year ago

I'd be very interested to see also how the speed-up scales with the chunk size (my guess is that with smaller chunks the speedup is even larger) and with the data file size (again, larger files should mean more chunks, so more speed up).

vincentvanhees commented 1 year ago

I'd be very interested to see also how the speed-up scales with the chunk size (my guess is that with smaller chunks the speedup is even larger) and with the data file size (again, larger files should mean more chunks, so more speed up).

Data is read block by block, with 1.2 seconds of sensor data per block, so I think this is already the minimum data length.