wadpac / GGIR

Code corresponding to R package GGIR
https://wadpac.github.io/GGIR/
Apache License 2.0

Minimum sample frequency #1163

Closed nkissel closed 2 months ago

nkissel commented 2 months ago

Describe the bug: When reading data from a custom csv format that contains minute-by-minute values (rather than values at the original sampling frequency), a memory error occurs because a very large matrix is created. It appears that the code assumes a minimum sample frequency. Is this correct?

To Reproduce: Read in summarized data that provide minute-by-minute movement, temperature, and light values rather than values at the original sampling frequency. The data were originally collected at a sampling frequency of 50 Hz but were exported at one value per minute (i.e., 1/60 Hz). Passing such a low sampling frequency (rmc.sf = 1/60) results in "Error: vector memory exhausted (limit reached?)", raised at line 117 of g.calibrate().

  1. Sensor brand: GENEActiv
  2. Data format: "%Y-%m-%d %H:%M:%S"
  3. Approximate recording duration: 14 days
  4. Are you using a sleep diary to guide the sleep detection: NO
  5. Copy of R command used:

    GGIR(
    do.enmo = T,
    acc.metric = "ENMO",
    windowsizes = c(5, 900, 5400),
    mode=c(1,2,3,4,5),
    datadir = "~/GENEActiv3/onef/trunc/223761_20.csv",  # CUSTOM CSV
    
    studyname = "study1",
    rmc.file = "~/GENEActiv3/onef/trunc/223761_20.csv",
    rmc.nrow = Inf,
    rmc.skip = 0,
    rmc.dec = ".",
    rmc.firstrow.acc = 41,
    rmc.col.time = 1,
    rmc.col.acc = 2:4,
    rmc.col.temp = 7,
    rmc.unit.acc = "g",
    rmc.unit.temp = "C",
    rmc.unit.time = "POSIX",
    rmc.format.time = "%Y-%m-%d %H:%M:%S:500",
    desiredtz = "America/New_York",
    rmc.firstrow.header = 1,
    rmc.header.length = 33,
    rmc.headername.sn = "Device Unique Serial Code",
    rmc.headername.recordingid = "Subject Code",
    rmc.sf = 1/60, # LOW SAMPLING FREQUENCY due to data being on minute scale
    minimumFileSizeMB = 0.2,
    
    outputdir=outputdir,
    do.report=c(2,4,5), 
    nonwear_approach = "2013",
    overwrite = T,
    nonWearEdgeCorrection = T, 
    
    strategy = 1,                  
    hrs.del.start = 0,             
    hrs.del.end = 0,               
    maxdur = 28,                   
    includedaycrit = 0,            
    qwindow=c(0,24),               
    bout.metric = 6,               
    excludefirstlast = FALSE,      
    includenightcrit = 0,          
    epochvalues2csv = TRUE,
    do.imp = T,               
    
    sleepwindowType = "SPT", 
    def.noc.sleep = c(),           
    outliers.only = FALSE,         
    HASPT.ignore.invalid = NA,
    HASIB.algo = "ColeKripke1992", 
    ignorenonwear = T,
    do.visual = F,            
    
    Sadeh_axis = 'Y',
    save_ms5rawlevels=TRUE,     
    save_ms5raw_format="csv",   
    save_ms5raw_without_invalid = FALSE,
    part5_agg2_60seconds=TRUE, 
    minimum_MM_length.part5 = 1,
    timewindow = c('WW', 'OO'),       
    visualreport=F
    )
  6. Have you tried processing your data based on GGIR's default argument values? Does the issue you report still appear? YES / NO

Expected behavior: GGIR should not throw "Error: vector memory exhausted (limit reached?)".

vincentvanhees commented 2 months ago

The code expects you to provide raw data. Minute-by-minute data can never be raw data and have most likely been pre-processed by other software. GGIR supports some specific formats for non-raw data, e.g. from Actiwatch and ActiGraph.
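
As a rough illustration of why a sampling frequency below 1 Hz can exhaust memory, consider the sketch below. It assumes, purely for illustration, that a numeric matrix is preallocated with a row count inversely proportional to the sampling frequency times the epoch length. The function name estimate_prealloc_gb, the sample budget of 90e6, the 10-second epoch, and the 7 columns are hypothetical and are not taken from GGIR's source.

```r
# Hypothetical helper, NOT part of GGIR: estimates the size in GB of a numeric
# matrix whose row count is preallocated as sample_budget / (sf * epoch_sec).
# All default values are illustrative assumptions.
estimate_prealloc_gb <- function(sf, epoch_sec = 10, n_col = 7,
                                 sample_budget = 90e6) {
  n_row <- ceiling(sample_budget / (sf * epoch_sec))
  bytes <- n_row * n_col * 8  # each double takes 8 bytes
  bytes / 1024^3
}

gb_raw    <- estimate_prealloc_gb(sf = 50)      # raw data at 50 Hz
gb_minute <- estimate_prealloc_gb(sf = 1 / 60)  # minute-level export

# The ratio between the two equals 50 / (1/60), i.e. about 3000.
print(c(raw = gb_raw, minute = gb_minute, ratio = gb_minute / gb_raw))
```

Under these assumptions, the 50 Hz case needs only a few megabytes, while sf = 1/60 asks for tens of gigabytes. That is the kind of request that triggers R's "vector memory exhausted" error.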

I am closing this issue now, but feel free to re-open if you think I misunderstood something.