Discrepancies of timestamp between GGIRread of GENEActiv and GENEActiv software #61

pinweichen commented 6 months ago

Hi there,

I recently compared the converted GENEActiv CSV file produced by GGIRread::GENEActivReader with the one produced by the GENEActiv software (provided by the GENEActiv engineers), and I noticed some discrepancies between them. The GGIRread file contains periodically repeated timestamps (making the sampling frequency 0), and its timestamps gradually deviate from those in the file from the GENEActiv software. I can provide example data (the .bin file and the CSV converted by the GENEActiv software) for you to look into this. I looked at the code in GENEActivReader and found readGENEActiv(), but I wasn't sure what this function does. Can you help me troubleshoot what might be causing these discrepancies? Thank you.

Benny

vincentvanhees commented 6 months ago

I am not sure comparing CSV exports is informative, because the GENEActiv software may do its own resampling or interpolation. If those steps are not open source, it will be hard to understand them, let alone replicate them.

When we developed GGIRread we used GENEAread::read.bin to check for consistency. So the best test would be to compare GGIRread::GENEActivReader with GENEAread::read.bin inside R, without exporting to CSV.

pinweichen commented 6 months ago

Hi Dr. van Hees,

I've tested the problematic .bin file with GENEAread::read.bin and GGIRread::GENEActivReader within R. The timestamp error occurs only in the data loaded with GGIRread::GENEActivReader, and it affects only 0.33% of the file (out of 12960000 rows of data sampled at 50 Hz). I've attached a screenshot of the error. The figure shows each error row and the row after it. If you look at the timestamps, you can see that the timestamp is repeated. I can provide the example .bin file as well if you are interested. The ts_lag_diff column holds the timestamp difference (shift lead) between consecutive rows. The sf column holds the corresponding sampling frequency calculated from ts_lag_diff. Please let me know if there are settings I can try to troubleshoot. Thank you so much for your response. I appreciate your help.
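For readers who want to reproduce this kind of check, here is a minimal sketch in Python of the lead-difference idea described above. It is not the code used for the screenshot: the function names are hypothetical, the timestamps are synthetic integer milliseconds, and reporting a zero step as None is an assumption, since the thread does not say how the sf column handled a zero difference.

```python
# Sketch: flag repeated timestamps via the difference to the next row.
# Synthetic data only; nothing here is read from the .bin file in this thread.

def lead_diffs(timestamps_ms):
    """Difference between each timestamp and the next one (the last row has none)."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]

def implied_frequency_hz(diff_ms):
    """Sampling frequency implied by one time step; None when the step is zero."""
    return None if diff_ms == 0 else 1000.0 / diff_ms

# 50 Hz samples (20 ms apart) with one repeated timestamp at index 3.
timestamps_ms = [0, 20, 40, 40, 80, 100]

diffs = lead_diffs(timestamps_ms)
# Rows whose next row carries the same timestamp.
repeated_rows = [i for i, d in enumerate(diffs) if d == 0]
# Fraction of all rows that are followed by a repeat.
share_repeated = len(repeated_rows) / len(timestamps_ms)
```

Using integer milliseconds avoids floating-point noise when testing a difference for exact equality with zero.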

vincentvanhees commented 6 months ago

Thanks, could you send me the example .bin file to v.vanhees@accelting.com?

vincentvanhees commented 6 months ago

Thanks for sending the file. I see that the double timestamp always happens at the start of each new 300-sample page. Your screenshot gives the impression that it happens every other sample, but yes, this indicates a bug.
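A rough way to confirm that pattern on a timestamp vector is sketched below in Python. It is an illustration only, not the GGIRread code: the helper names are hypothetical, the data is synthetic, and it assumes the repeat shows up as the first sample of a page carrying the same timestamp as the last sample of the previous page.

```python
# Sketch: check whether repeated timestamps line up with page boundaries.
PAGE_SIZE = 300  # samples per page, as mentioned in the comment above

def duplicate_positions(timestamps_ms):
    """Indices whose timestamp equals the one immediately before it."""
    return [i for i in range(1, len(timestamps_ms))
            if timestamps_ms[i] == timestamps_ms[i - 1]]

def all_on_page_starts(positions, page_size=PAGE_SIZE):
    """True when every position is the first sample of a page."""
    return all(i % page_size == 0 for i in positions)

# Synthetic 50 Hz series (20 ms steps) over three pages, where the first
# sample of each new page repeats the previous timestamp (the assumed bug).
timestamps_ms = []
t = 0
for i in range(3 * PAGE_SIZE):
    if i > 0 and i % PAGE_SIZE != 0:
        t += 20  # normal step inside a page
    timestamps_ms.append(t)

positions = duplicate_positions(timestamps_ms)
on_page_starts = all_on_page_starts(positions)
```

One repeat per 300-sample page would amount to roughly 1/300, or about 0.33% of rows, which is in line with the share reported earlier in the thread.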

I am now preparing a fix.

vincentvanhees commented 6 months ago

... just to add: For GGIR users this will have no impact as GGIR only uses the first timestamp in the file and then extrapolates based on sampling rate.
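As an illustration of what extrapolating from the first timestamp means, here is a minimal Python sketch. It is not GGIR's implementation: the function name and the example values are hypothetical, and it simply assumes a perfectly constant sampling rate.

```python
# Sketch: rebuild timestamps from the first timestamp and a constant rate.
def extrapolate_timestamps(first_timestamp_s, sampling_rate_hz, n_samples):
    """Timestamp of sample i is first_timestamp_s + i / sampling_rate_hz."""
    return [first_timestamp_s + i / sampling_rate_hz for i in range(n_samples)]

# Hypothetical example: recording starts at t = 1000 s and is sampled at 50 Hz.
timestamps_s = extrapolate_timestamps(1000.0, 50.0, 5)
```

Because every timestamp is derived from the sample index, a repeated timestamp stored in the file cannot carry over into the reconstructed series.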

vincentvanhees commented 6 months ago

I will submit to CRAN soon, but if you happen to have your R environment configured to build C++ code, you could test it out with: remotes::install_github("wadpac/GGIRread", ref = "issue61_increamenting_time_GENEActiv")

pinweichen commented 6 months ago

Thank you, Dr. van Hees. I will test the new version.

Regarding your comment on how GGIR uses GENEActiv data: does the extrapolation replace the timestamps on the assumption that each row is sampled at a fixed frequency? Or does GGIR impute values using linear or spline interpolation? Does this extrapolation step apply to all data in part 1? (I noticed that ActiGraph has special preprocessing because of the potential idle time settings.)

Best, Benny

vincentvanhees commented 6 months ago

GGIR, with help from GGIRread and read.gt3x, processes each data format in a way that is tailored to the known issues of that format:

For GENEActiv, we assume a constant sampling rate, as I found that approach empirically more reliable than trusting the timestamps.
For ActiGraph, idle sleep mode is taken into account.
For Axivity, timestamps are resampled and faulty data blocks are imputed.
