Google-Health / genomics-research


ValueError for non-integer FIDs / IIDs #1

Closed: demodw closed this issue 3 years ago

demodw commented 3 years ago

Hi,

I'm encountering the following error when trying to run DeepNull:

ValueError: invalid literal for int() with base 10 (see screenshot)

The offending line seems to be https://github.com/Google-Health/genomics-research/blob/6af3a4bb152fe7902e6d36c207044e1b94540c05/nonlinear-covariate-gwas/data.py#L172.

I guess it's a remnant from all person IDs in the UK Biobank being integers. I suspect I'm not the only one who will run into this problem, though, so it might be worthwhile to allow for non-integer FIDs/IIDs.

[screenshot of the error]
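For readers without the screenshot, the failure can be reproduced in a few lines. This is only a minimal sketch, not the DeepNull code itself: `cast_ids_strict` and the sample IDs are hypothetical stand-ins for whatever the linked line does when it casts the FID / IID columns to integers.

```python
# Hypothetical stand-in for an unconditional integer cast of an ID column.
def cast_ids_strict(ids):
    """Cast every ID to int; raises ValueError on the first non-integer ID."""
    return [int(x) for x in ids]


# Made-up family IDs: two integer-like values and one alphanumeric value.
fids = ["1001", "1002", "FAM_A"]

try:
    cast_ids_strict(fids)
    message = None
except ValueError as err:
    # Python reports the offending value in the exception message.
    message = str(err)

print(message)
```

Any cohort whose IDs contain letters, underscores, or other non-digit characters would hit the same exception under an unconditional cast like this.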

cmclean commented 3 years ago

Thank you for this issue report! Indeed, the existing code assumed that all fields are numeric. With the existing code, a workaround is to pass the cast_ints=False flag to the write_plink_or_bolt_file function. I've also just pushed https://github.com/Google-Health/genomics-research/commit/e8dfa6ac4947e3760a77bc004254929444a6fcea, which fixes the issue. An updated pip package (v0.1.3) is also available on PyPI.
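As an illustration of the general idea of tolerating non-integer IDs, here is a small sketch of a cast that falls back to strings. It is an assumption about one reasonable approach, not the code from the commit above; `cast_ids_if_integral` is a hypothetical helper, and the real `write_plink_or_bolt_file` may handle this differently.

```python
# Hypothetical helper: cast an ID column to int only if every value allows it.
def cast_ids_if_integral(ids):
    """Return the IDs as ints if all of them parse as integers, else as strings.

    The decision is made for the whole column so that the output never mixes
    ints and strings.
    """
    try:
        return [int(x) for x in ids]
    except ValueError:
        return [str(x) for x in ids]


# Made-up example columns.
numeric_ids = cast_ids_if_integral(["1001", "1002", "1003"])
mixed_ids = cast_ids_if_integral(["1001", "1002", "FAM_A"])

print(numeric_ids)
print(mixed_ids)
```

Deciding per column rather than per value keeps the written file consistent: integer-only cohorts such as the UK Biobank still get integer IDs, while cohorts with alphanumeric IDs pass through unchanged.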