Closed eric-czech closed 3 years ago
The whole file finished processing overnight, and this error occurs for only two fields:
gsutil cat gs://rs-ukb/prep/main/log/phesant/phesant.log | grep -C 10 Error
[1] "ERROR: 670_0 Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)): undefined columns
[1] "ERROR: 1220_0 Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)): undefined columns
There are no obvious differences between the two fields, and the second one (1220) has at least one occurrence of every value in its data coding, so the theory that a missing coded value causes the error is wrong.
These edits are being used to ignore these errors: https://github.com/eric-czech/PHESANT/commit/05997a79c734a0706f7622e8c9c734984f1da130
The rest of the data appears to be fine, so these two fields will be omitted going forward.
Filtering the input phenotype data for PHESANT using sample ids that pass genetic data QC results in an error like this:
Dumping this environment and debugging it shows that the data for the failed field (670) may not contain one of the values in the encoding:
Field: https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=680
Encoding: https://biobank.ctsu.ox.ac.uk/crystal/coding.cgi?id=100287
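To check whether sample filtering removes every occurrence of a coded value, something like the following sketch can be run on the phenotype table before passing it to PHESANT. It is not the code used in this pipeline. The column name `x670_0_0`, the `eid` id column, the toy data, and the coding values `1`, `2`, `3` are all assumptions for illustration; the real values would come from the data coding linked above.

```python
import csv
import io


def filter_samples(rows, keep_ids, id_col="eid"):
    """Keep only rows whose sample id is in keep_ids."""
    return [row for row in rows if row[id_col] in keep_ids]


def unobserved_codes(rows, column, coding_values):
    """Return the coded values that never occur in `column`, sorted."""
    # Empty strings are treated as missing and ignored.
    observed = {row[column] for row in rows if row[column] != ""}
    return sorted(set(coding_values) - observed)


# Hypothetical toy phenotype table (not real UK Biobank data).
RAW = """eid,x670_0_0
1,1
2,2
3,3
4,
5,1
"""

rows = list(csv.DictReader(io.StringIO(RAW)))
coding = ["1", "2", "3"]      # assumed coding values for the field
qc_pass = {"1", "2", "5"}     # assumed ids passing genetic data QC

before = unobserved_codes(rows, "x670_0_0", coding)
after = unobserved_codes(filter_samples(rows, qc_pass), "x670_0_0", coding)

print("unobserved before filtering:", before)
print("unobserved after filtering:", after)
```

In this toy case the value `3` is present in the full table but disappears once only QC-passing samples are kept, which is the situation suspected for field 670. As noted for field 1220, a field can still fail with every coded value present, so this check can rule the explanation in or out per field but does not identify the root cause.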