junyangq / snpnet

snpnet: Fast and scalable lasso/elastic-net solver for large SNP data

Error: Problem with `mutate()` column `stats_means`. #39

Open julibeg opened 3 years ago

julibeg commented 3 years ago

When I try to run snpnet, `computeStats` fails with:

Error: Problem with `mutate()` column `stats_means`.
ℹ `stats_means = (HAP_ALT_CTS + HET_REF_ALT_CTS + 2 * TWO_ALT_GENO_CTS)/OBS_CT`.
✖ non-numeric argument to binary operator

My guess is that this is caused by merged multiallelic sites: for those, PLINK writes comma-separated numbers into some columns of the `--geno-counts` output, so the arithmetic in `stats_means` fails. To avoid the issue, I split the merged variants. However, snpnet then fails with:

Error: --read-freq variant ID 'rs112422003' appears multiple times in main

So it looks like the user needs to make sure that the variants are split and have unique IDs. Is that correct? If so, then it would be nice if you could add this to the documentation somewhere.
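
In case it helps others, here is a minimal sketch of the two checks I have in mind, assuming my guess above is right. The count column names come from the error message. The `ID` column name, the tab-separated layout, and the sample rows are assumptions made up for illustration, not real PLINK output, and the function names are hypothetical.

```python
import csv
import io
from collections import Counter

# Count columns named in the `stats_means` error message.
COUNT_COLUMNS = ("HAP_ALT_CTS", "HET_REF_ALT_CTS", "TWO_ALT_GENO_CTS")


def find_multivalue_ids(gcount_text, id_column="ID"):
    """Return IDs of rows where any count column holds a comma-separated list."""
    reader = csv.DictReader(io.StringIO(gcount_text), delimiter="\t")
    return [
        row[id_column]
        for row in reader
        if any("," in row[col] for col in COUNT_COLUMNS)
    ]


def find_duplicate_ids(ids):
    """Return variant IDs that occur more than once, in first-seen order."""
    counts = Counter(ids)
    return [vid for vid, n in counts.items() if n > 1]


# Made-up sample table (assumed layout), not real PLINK output.
SAMPLE_GCOUNT = (
    "ID\tHAP_ALT_CTS\tHET_REF_ALT_CTS\tTWO_ALT_GENO_CTS\tOBS_CT\n"
    "var_a\t0\t12\t3\t100\n"
    "var_b\t0,0\t7,2\t1,0,0\t100\n"
)

if __name__ == "__main__":
    # Rows that would break the numeric computation.
    print(find_multivalue_ids(SAMPLE_GCOUNT))
    # IDs that would trigger the "appears multiple times" error.
    print(find_duplicate_ids(["var_a", "var_c", "var_c"]))
```

Running both checks on the input before calling snpnet would flag the unsplit multiallelic sites and the duplicated IDs up front, rather than failing partway through.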