rgcgithub / regenie

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
https://rgcgithub.github.io/regenie

Regenie-v2.2: unexpected error for quantitative trait #146

Closed Shicheng-Guo closed 3 years ago

Shicheng-Guo commented 3 years ago

I received an unexpected error in version 2.2: ERROR: all individuals have missing/invalid values for phenotype 'X30620'.

It is a quantitative trait with 1000+ values, so this should not happen. Version 2.0.2, at least, works fine with the same input.

Fitting null model

regenie \
  --step 1 \
  --bed /home/sguo2/ukb/Genotyped-hg19/plink/ukb_cal_allChrs \
  --extract /home/sguo2/ukb/analytic/step1/qc_pass.prune.in \
  --keep /home/sguo2/ukb/analytic/step1/qc_pass.id \
  --phenoFile /home/sguo2/data/PHENO/UKB.quantitative_traits.Guo.vRegenie.irnt.178.txt \
  --covarFile /home/sguo2/ukb/WES-hg38/UKB.cov.r2.Guo.txt \
  --covarCol PC{1:20},Year,iSex,YearSquare \
  --threads 12 \
  --strict \
  --loocv \
  --lowmem \
  --lowmem-prefix tmpdir/regenie_tmp_preds.4011497 \
  --bsize 1000 \
  --out Ukb_quantitative_traits

joellembatchou commented 3 years ago

I see you are using --strict; does it still fail if you omit this option?
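One way to check this hypothesis before rerunning: as far as I understand it, --strict drops every sample that has a missing value in any of the phenotypes, so with many traits in one file the set of complete cases can shrink to zero. The sketch below counts complete cases. It assumes a whitespace-delimited phenotype file with one header row, FID and IID in the first two columns, and NA as the missing-value code; `count_complete_cases` is a hypothetical helper name, not part of regenie.

```shell
# Hypothetical helper (not part of regenie): count samples that have no
# missing value in any phenotype column.
# Assumes: one header row, FID and IID in columns 1-2, whitespace-delimited,
# missing values coded as NA.
count_complete_cases() {
  awk 'NR > 1 && NF >= 3 {
         complete = 1
         # Scan the phenotype columns and stop at the first missing value.
         for (i = 3; i <= NF; i++) if ($i == "NA") { complete = 0; break }
         if (complete) n++
       }
       END { print n + 0 }' "$1"
}
```

Calling `count_complete_cases` on the file passed to --phenoFile prints the number of samples with every trait observed. If that number is 0, the error is expected under --strict. This only looks at the phenotype file; missing covariates and the --keep list can reduce the count further.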

bicyclic commented 2 years ago

Hi Joelle, was there a fix reported for this somewhere? I'm running into a similar issue in step 2.

The following step 1 command worked fine:

regenie \
  --step 1 \
  --bed /usr/ukb_geno_chrAll_v2 \
  --remove /usr/invalid_FIDIID.txt \
  --remove /usr/plinkButNotImputed_FIDIID.txt \
  --remove /usr/allAncestryQC_FIDIID.txt \
  --remove /usr/ExcludePhenoAll.txt \
  --exclude /usr/GenoQcAllAncestry.txt \
  --phenoFile /usr/fatPhenoCovarINT.txt \
  --phenoColList vat,asat,gfat,vatadjbmi3,asatadjbmi3,gfatadjbmi3,vatAsatRatio,vatGfatRatio,asatGfatRatio \
  --covarFile /usr/2021.03.11_datafreeze/fatPhenoCovarINT.txt \
  --covarColList age_instance2,age_instance2_sq,PC{1:10} \
  --catCovarList sex,genotyping_array,mriNum \
  --bsize 1000 \
  --lowmem \
  --lowmem-prefix ./testrun/level0pred \
  --out ./testrun/step1_out \
  --gz

But the corresponding step 2 command gives me "ERROR: all individuals have missing/invalid values for phenotype 'X30620'."

regenie \
  --step 2 \
  --bgen /usr/ukb_imp_chr22_v3.bgen \
  --remove /usr/invalid_FIDIID.txt \
  --remove /usr/plinkButNotImputed_FIDIID.txt \
  --remove /usr/allAncestryQC_FIDIID.txt \
  --remove /usr/ExcludePhenoAll.txt \
  --phenoFile /usr/fatPhenoCovarINT.txt \
  --phenoColList vat,asat,gfat,vatadjbmi3,asatadjbmi3,gfatadjbmi3,vatAsatRatio,vatGfatRatio,asatGfatRatio \
  --covarFile /usr/fatPhenoCovarINT.txt \
  --covarColList age_instance2,age_instance2_sq,PC{1:10} \
  --catCovarList sex,genotyping_array,mriNum \
  --bsize 1000 \
  --pred ./testrun/step1_out_pred.list \
  --out ./testrun/step2_regenie \
  --gz

All the inputs with respect to the samples are identical to my eye, so I'm unsure about what's causing this. Thanks!

joellembatchou commented 2 years ago

Are you using the most recent version of Regenie? And can you include the log (the part showing the sample counts in the input files, before the variant analysis starts)?

bicyclic commented 2 years ago

In this particular run, I ran Step 1 with Regenie 2.0.2 and then attempted to run Step 2 with Regenie 2.2.4. I've copied the full log information below.

              |=============================|
              |      REGENIE v2.2.4.gz      |
              |=============================|

Copyright (c) 2020-2021 Joelle Mbatchou, Andrey Ziyatdinov and Jonathan Marchini.
Distributed under the MIT License.
Compiled with Boost Iostream library.

Log of output saved in file : ./testrun/step2_regenie_2.2.4_on_step1_regenie_2.0.2.log

Options in effect:
  --step 2 \
  --bgen /usr/ukb_imp_chr22_v3.bgen \
  --phenoFile /usr/phenoFile \
  --phenoColList traits \
  --covarFile /usr/covarFile \
  --covarColList age_instance2,age_instance2_sq,PC{1:10} \
  --catCovarList sex,genotyping_array,mriNum \
  --bsize 1000 \
  --pred ./testrun/step1_out_pred.list \
  --out ./testrun/step2_regenie_2.2.4_on_step1_regenie_2.0.2 \
  --gz 

Association testing mode with fast multithreading using OpenMP
 * bgen             : [/usr/ukb_imp_chr22_v3.bgen]
   -summary : bgen file (v1.2 layout, zlib compressed) with 487409 anonymous samples and 1255683 variants with 8-bit encoding.
   -index bgi file [/usr/ukb_imp_chr22_v3.bgen.bgi]
 * phenotypes       : [/usr/phenoFile] n_pheno = 6
ERROR: all individuals have missing/invalid values for all traits.

pottj commented 2 years ago

Hi there, I ran into the same problem in step 2 (using v2.2.4). I used the same phenotype file for step 1 and step 2. Log attached: REGENIE_step2.log

pottj commented 2 years ago

Update: I added the --sample flag and step2 finished without error. So I guess --bgen always needs its corresponding sample file within the options?
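This is consistent with the log posted earlier in the thread, which reports the BGEN file as having "487409 anonymous samples": the file carries no embedded sample IDs, so without a sample file there is nothing to match against the FID/IID columns of the phenotype file. A quick sanity check before rerunning is to count how many phenotype-file samples also appear in the sample file. The sketch below assumes an Oxford-format .sample file (two header lines, ID_1 and ID_2 in the first two columns) and a phenotype file with one header line and FID and IID in the first two columns; `count_shared_ids` is a hypothetical helper name, not part of regenie.

```shell
# Hypothetical helper (not part of regenie): count phenotype-file samples
# whose FID/IID pair also appears in an Oxford-format .sample file.
# Assumes: the .sample file is non-empty and has two header lines with
# ID_1 ID_2 in columns 1-2; the phenotype file has one header line with
# FID IID in columns 1-2.
count_shared_ids() {
  sample_file=$1
  pheno_file=$2
  awk 'FNR == NR { if (FNR > 2) seen[$1 FS $2] = 1; next }  # first file: .sample
       FNR > 1 && (($1 FS $2) in seen) { n++ }              # second file: phenotypes
       END { print n + 0 }' "$sample_file" "$pheno_file"
}
```

A result of 0 from `count_shared_ids` would point to an ID mismatch (for example, different FID conventions) rather than a missing option. A non-zero result suggests that passing the same file to --sample in step 2 should let the samples be matched, which is what resolved the error here.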

bicyclic commented 2 years ago

Hi pottj,

Yes I overlooked that! I added --sample back in and step2 finished without error. Thank you both!

Best, bicyclic

HackerLZH commented 7 months ago

Update: I added the --sample flag and step2 finished without error. So I guess --bgen always needs its corresponding sample file within the options?

Good!