wadpac / GGIR

Code corresponding to R package GGIR
https://wadpac.github.io/GGIR/
Apache License 2.0

Identification of nights and days to exclude in data_cleaning_File #1168

Closed jhmigueles closed 2 months ago

jhmigueles commented 2 months ago

Line 197 in g.report.part4 tries to identify the nights that should be excluded from the nightsummary clean report by matching IDs and night numbers:

days2exclude = which(nightsummary$ID %in% DaCleanFile$ID & nightsummary$night %in% DaCleanFile$night_part4)

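The matching is problematic because it tests IDs and nights independently rather than as ID-night pairs. A night is excluded whenever its ID appears anywhere in the ID column of the data_cleaning_file and its night number appears anywhere in the night_part4 column, even if that combination is not listed on any single row. For example, if the data_cleaning_file lists ID = 1 with night = 6 and ID = 2 with night = 3, then night 3 of ID 1 and night 6 of ID 2 would also be excluded. See example here: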

# Artificial data cleaning file indicating 4 nights to remove (1 from each recording)
DaCleanFile = data.frame(ID = 1:4,
                         night_part4 = 4:1)

# Artificial nightsummary, including 4 recordings of 4 nights each
nightsummary = data.frame(ID = rep(1:4, each = 4),
                          night = rep(1:4, times = 4))

# Line 197 in g.report.part4
days2exclude = which(nightsummary$ID %in% DaCleanFile$ID & nightsummary$night %in% DaCleanFile$night_part4)

# We would expect 4 nights to be removed from nightsummary, but we get 16 (all nights)
length(days2exclude) # 16
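
One possible way to make the match pairwise is to build a single key per ID-night combination and compare the keys, so that both values have to agree on the same row. The snippet below is only a sketch on the artificial data from the example above, not the fix that was applied in GGIR; the helper variables key_nightsummary and key_clean and the "_" separator are assumptions made for illustration.

```r
# Same artificial data as in the example above
DaCleanFile = data.frame(ID = 1:4,
                         night_part4 = 4:1)
nightsummary = data.frame(ID = rep(1:4, each = 4),
                          night = rep(1:4, times = 4))

# Build one key per ID-night pair (hypothetical helper variables).
# The "_" separator is an assumption; it keeps e.g. ID 1 / night 11
# distinct from ID 11 / night 1
key_nightsummary = paste(nightsummary$ID, nightsummary$night, sep = "_")
key_clean = paste(DaCleanFile$ID, DaCleanFile$night_part4, sep = "_")

# A row is flagged only if its exact ID-night pair is listed in the cleaning file
days2exclude = which(key_nightsummary %in% key_clean)

length(days2exclude) # 4
days2exclude # 4 7 10 13
```

With this approach only the four listed nights are flagged (one per recording), which is the result the original line was expected to give.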