privefl / bigstatsr

R package for statistical tools with big matrices stored on disk.
https://privefl.github.io/bigstatsr/

big_univLinReg does not support missing X values? #136

Closed (nettam closed this issue 2 years ago)

nettam commented 3 years ago

I have an FBM generated from a bed file with missing values (not all SNPs passed QC for all samples). I would like to run a GWAS and figured big_univLinReg would do the job, but I get an error message saying that X contains missing values. I am not sure why this would be a problem: missing values could be ignored per SNP, and other packages support this. Other than running imputation prior to the GWAS, is there any way to run this analysis using bigstatsr?

Thanks

privefl commented 3 years ago

Yes, most functions in both packages (bigstatsr and bigsnpr) don't handle missing values. Handling them is not as straightforward for other functions, and it generally prevents using some linear algebra tricks. So I have decided not to go down that road.
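To illustrate the kind of linear algebra shortcut that missing values get in the way of, here is a minimal base R sketch. It is an illustration on made-up toy data, not a description of how bigstatsr is implemented internally. With complete data, the phenotype can be centered once and all per-variant slopes come out of a single cross-product. With missing values, each column uses a different subset of samples, so the centering and the sums have to be redone column by column.

```r
# Toy data (made-up numbers): 6 samples, 3 variants, no missing values.
X <- matrix(c(0, 1, 2, 1, 0, 2,
              2, 2, 1, 0, 1, 0,
              1, 0, 0, 2, 2, 1), nrow = 6)
y <- c(1.2, 0.7, 3.1, 2.0, 0.3, 2.9)

# Complete data: center y once, center the columns of X,
# then get every slope from one cross-product.
yc <- y - mean(y)
Xc <- sweep(X, 2, colMeans(X))
beta_fast <- drop(crossprod(Xc, yc)) / colSums(Xc^2)

# Reference: one lm() fit per column gives the same slopes.
beta_lm <- apply(X, 2, function(x) coef(lm(y ~ x))[[2]])
stopifnot(isTRUE(all.equal(beta_fast, beta_lm)))

# With a missing genotype, lm() silently drops that sample for that column only,
# so the set of samples (and hence the centering of y) differs per column.
X_na <- X
X_na[1, 1] <- NA
beta_na <- apply(X_na, 2, function(x) coef(lm(y ~ x))[[2]])
```

The per-column `lm()` loop still works with missing values, which is what "ignoring them per SNP" amounts to, but it gives up the single-pass computation that makes the complete-data case fast.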

You should either read from already imputed data or impute the missing values first (see e.g. https://privefl.github.io/bigsnpr-extdoc/preprocessing.html#imputation).
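As a rough picture of what "impute, then run the association test" looks like, here is a minimal base R sketch on a made-up toy matrix. The helper `impute_col_mean()` is hypothetical and only stands in for a real imputation step. On an actual FBM one would use the imputation functions described in the linked documentation (bigsnpr provides `snp_fastImputeSimple()`, for example) and then call `big_univLinReg()` on the imputed matrix.

```r
# Toy genotype matrix (made-up values) with two missing calls.
X <- matrix(c(NA, 1, 2, 1, 0, 2,
              2, 2, 1, 0, NA, 0,
              1, 0, 0, 2, 2, 1), nrow = 6)
y <- c(1.2, 0.7, 3.1, 2.0, 0.3, 2.9)

# Hypothetical helper: replace each missing entry with the mean of the
# observed values in the same column.
impute_col_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}
X_imp <- apply(X, 2, impute_col_mean)
stopifnot(!anyNA(X_imp))

# After imputation every column uses all samples, so the slopes for all
# variants can again be computed in one pass.
yc <- y - mean(y)
Xc <- sweep(X_imp, 2, colMeans(X_imp))
beta <- drop(crossprod(Xc, yc)) / colSums(Xc^2)
```

Column-mean imputation is the crudest option and is used here only to keep the sketch short. Which method is appropriate depends on the data and the amount of missingness.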