While editing Greg Smith's QC normalization code, I hit a wrinkle when I tried to upgrade his GCP read-ins to the latest functions: dl_read_gcp uses data.table::fread to import data. If you use it to read the EQC, or any other file that has vialLabel as a column, data.table represents that column as integer64 because the values are larger than 2^31. I don't really understand why it does this instead of making the column numeric or character, since base R integers are 32-bit, but it caused problems because the Windows and Linux versions of R apparently handle this vector differently: Windows failed to execute Greg's code, while Linux succeeded. Greg's own code worked because he was reading in the GCP files manually and must have read the column in some other way. That, plus the fact that so few of us run Windows, is probably why no one noticed. For now, I have worked around the problem by manually converting the column to character after it is read in by dl_read_gcp.
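For reference, the workaround looks roughly like the sketch below. It does not call dl_read_gcp itself; instead it calls data.table::fread directly on a small made-up table (the gcp_text string and its values are hypothetical stand-ins for a real GCP file), and it assumes the bit64 package is installed so the integer64 values convert cleanly.

```r
library(data.table)

# Hypothetical stand-in for a GCP file; the vialLabel values exceed 2^31
gcp_text <- "value\tvialLabel\n1.5\t90001013104\n2.5\t90001013105\n"

# With fread's default (integer64 = "integer64"), large whole numbers
# are read as a bit64::integer64 column
dt <- fread(text = gcp_text, sep = "\t")
class(dt$vialLabel)

# Workaround: convert the identifier column to character right after reading
dt[, vialLabel := as.character(vialLabel)]
class(dt$vialLabel)
```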
Two solutions:
The faster option is to add an argument to the data.table::fread call in dl_read_gcp that sets integer64 = "numeric" or integer64 = "character". Personally, I think vialLabel behaves better as a character, since it's an identifier, not a value, but integer64 = "character" behaves oddly: it makes missing values blank rather than NA.
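A minimal sketch of this first option, again calling fread directly on a made-up table rather than editing dl_read_gcp (the gcp_text contents are hypothetical). The extra step that turns blank identifiers back into NA is my assumption about how to handle the blank-versus-NA behavior noted above, not something dl_read_gcp does today.

```r
library(data.table)

# Hypothetical stand-in for a GCP file; the second row has a missing vialLabel
gcp_text <- "value\tvialLabel\n1.5\t90001013104\n2.5\t\n"

# Ask fread to return would-be integer64 columns as character instead
dt <- fread(text = gcp_text, sep = "\t", integer64 = "character")

# Normalize any blank identifiers to NA so downstream NA checks still work
dt[vialLabel == "", vialLabel := NA_character_]
```

If this went into dl_read_gcp, the blank-to-NA step would probably need to run over every column affected by the integer64 setting, not just vialLabel.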
The better option (IMHO) is to switch to the tidyverse and have all the read-ins use readr::read_delim, though this could require further updates to some existing code.
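A rough sketch of what the readr version could look like, assuming readr 2.0 or later (for the I() literal-data wrapper) and a made-up tab-delimited table in place of a real GCP file. Declaring vialLabel as character in col_types is my suggestion here; without it, readr would guess a numeric type for that column.

```r
library(readr)

# Hypothetical stand-in for a GCP file; the second row has a missing vialLabel
gcp_text <- "value\tvialLabel\n1.5\t90001013104\n2.5\t\n"

# Read the identifier as character; other columns keep readr's type guessing.
# With readr's default na = c("", "NA"), the blank field is read as NA.
df <- read_delim(
  I(gcp_text),
  delim = "\t",
  col_types = cols(vialLabel = col_character())
)
```

Note that read_delim returns a tibble rather than a data.table, which is one place where existing code might need the further updates mentioned above.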