yukt / MetaMinimac2


Segmentation Fault when Meta-imputing #2

Open moefrumkin opened 2 years ago

moefrumkin commented 2 years ago

I am attempting to meta-impute the results of two different imputations with the following command: MetaMinimac2 -i imputed/MEGA_TPMD/38/vcf/chr${chrom}:imputed/MEGA_KG38/38/vcf/chr${chrom} -o metaimputed/AB/chr${chrom}

However, when I run MetaMinimac2, the program always exits with a Segmentation Fault:

 ------------------------------------------------------------------------------ 
             MetaMinimac2 -- An Efficient Tool for Meta-Imputation    
 ------------------------------------------------------------------------------
 (c) 2019 - Ketian Yu, Sayantan Das, Goncalo Abecasis 
 Version : 1.0.0;
 Built   : Thu Jul 14 12:58:08 EDT 2022 by frumkin

 URL = http://genome.sph.umich.edu/wiki/MetaMinimac2
 GIT = https://github.com/yukt/MetaMinimac2.git

 Command Line Options: 
      --input [imputed/MEGA_TPMD/38/vcf/chr21:imputed/MEGA_KG38/38/vcf/chr21],
      --output [metaimputed/AB/chr21],
      --format [GT,DS,HDS],
      --skipPhasingCheck [OFF],
      --skipInfo [OFF],
      --nobgzip  [OFF],
      --weight   [OFF],
      --log      [OFF].

 ------------------------------------------------------------------------------
                             INPUT VCF DOSAGE FILE                             
 ------------------------------------------------------------------------------

 Number of Studies : 2
 -- Study 1 Prefix : imputed/MEGA_TPMD/38/vcf/chr21
 -- Study 2 Prefix : imputed/MEGA_KG38/38/vcf/chr21

 Checking sample compatibility across files ... 
 -- Found 1803 samples (3606 haplotypes).

 Scanning input empirical VCFs for commonly typed SNPs ... 
 -- Study 1 #Genotyped Sites = 10510
 -- Study 2 #Genotyped Sites = 10559
 -- Found 10339 commonly genotyped! 
 -- Successful (8 seconds) !!!

 Checking phasing consistency across input files ... 
 -- Consistent among commonly genotyped sites.
 -- Completed in 25 seconds.

 ------------------------------------------------------------------------------
                               WEIGHT ESTIMATION                               
 ------------------------------------------------------------------------------

 Estimate Meta-Weights for Sample 1-1000 [55.5%] ...
 -- Loading Empirical Dosage Data ...
 -- Calculating Weights ... 
 -- Successful (91 seconds) !!! 

 Estimate Meta-Weights for Sample 1001-1803 [100.0%] ...
 -- Loading Empirical Dosage Data ...
 -- Calculating Weights ... 
 -- Successful (72 seconds) !!! 

 Appending to final output weight file : metaimputed/AB/chr21.metaWeights.gz
 -- Successful (73 seconds) !!!

 Weight Estimation Completed in 0 hours, 3 mins, 56 seconds.

 ------------------------------------------------------------------------------
                               FINAL ANALYSIS                               
 ------------------------------------------------------------------------------
/var/spool/uger-8.5.5/uger-c071/job_scripts/32232583: line 5: 11999 Segmentation fault      (core dumped)

Is there any way I can resolve this issue?

phb16 commented 2 years ago

I am having the same issue when meta-imputing TOPMed and 1000 Genomes b38 30x results. With --skipInfo OFF I get a segmentation fault (core dumped). With --skipInfo ON the run completes, but the output contains negative dosages and dosages greater than 2.
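To make a report like this easier to reproduce, the out-of-range values can be listed per record. The following is a minimal sketch, not part of MetaMinimac2: the helper name `out_of_range_dosages` is hypothetical, and it assumes tab-separated VCF records with a DS entry in the FORMAT column and diploid samples, so valid dosages lie in [0, 2].

```python
def out_of_range_dosages(vcf_line, low=0.0, high=2.0):
    """Return (sample_index, dosage) pairs whose DS value lies outside [low, high].

    Assumes one tab-separated VCF data record with FORMAT in column 9.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    if "DS" not in format_keys:
        return []
    ds_index = format_keys.index("DS")

    bad = []
    for sample_index, sample in enumerate(fields[9:]):
        values = sample.split(":")
        # Skip samples with a truncated or missing DS entry.
        if ds_index >= len(values) or values[ds_index] == ".":
            continue
        dosage = float(values[ds_index])
        if dosage < low or dosage > high:
            bad.append((sample_index, dosage))
    return bad


# Made-up record for illustration only.
line = "chr21\t100\t.\tA\tG\t.\tPASS\t.\tGT:DS\t0|1:1.02\t1|1:2.31\t0|0:-0.04"
print(out_of_range_dosages(line))  # prints [(1, 2.31), (2, -0.04)]
```

Sample indices are 0-based and follow the order of the sample columns. For samples that are haploid at a site the upper bound would be 1 rather than 2.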

mohadesesd commented 1 year ago

I keep receiving a segmentation fault for the X chromosome only, and I am not sure why. I checked the code and it doesn't seem to do anything different or have any special requirements for chromosome X.

LeeDarian commented 1 year ago

I keep receiving a segmentation fault for chromosome 15 only.

yukt commented 1 year ago

Could you send me a toy example file so I can debug on my end?


LeeDarian commented 1 year ago

Thanks for your reply! I use the Michigan Imputation Server. My input data is based on hg38. I phased my own data with Eagle2 and then imputed it against the asia-panel (hg19) and the hrc-panel (hg19) separately.

My code: MetaMinimac2 -i ./Michigan_ASIA/ASA/data/chr${i}:./Michigan_HRC/ASA/data/chr${i} -o ./Meta/ASA/ASIA_HRC.meta.chr${i}  --skipInfo --weight --log

When I didn't use the --skipInfo parameter, the output file got an error. When I added the --skipInfo parameter, it worked fine, but chromosome 15 had the following error:

Program received signal SIGSEGV, Segmentation fault. 0x0000000000427b0d in HaplotypeSet::ReadBasedOnSortCommonGenotypeList(std::vector<std::string, std::allocator<std::string> >&, int, int) ()

If I merge each imputation with itself, like this:

MetaMinimac2 -i ./Michigan_ASIA/ASA/data/chr${i}:./Michigan_ASIA/ASA/data/chr${i} -o ./Meta/ASA/ASIA_HRC.meta.chr${i}  --skipInfo --weight --log

or like this:

MetaMinimac2 -i ./Michigan_HRC/ASA/data/chr${i}:./Michigan_HRC/ASA/data/chr${i} -o ./Meta/ASA/ASIA_HRC.meta.chr${i}  --skipInfo --weight --log

the software works, so I think the problem is probably not the VCF file format. The files were all generated by the Michigan Imputation Server.


Large attachments sent from QQ Mail

chr15.empiricalDose_ASIA.vcf.gz (36.67M, expires) Download page: http://mail.qq.com/cgi-bin/ftnExs_download?t=exs_ftn_download&k=66316662c860d49fa948085d1e35534d4a455451010c05071f0755500118055305084b000001524f05075e570a01055a03500204382a61015a43575716500c125b430f015959250d415439236b7c204c4452004c5f4f615f&code=21fb85ab

chr15.dose_HRC.vcf.gz (1.48G, expires 2023-10-01 23:02) Download page: http://mail.qq.com/cgi-bin/ftnExs_download?t=exs_ftn_download&k=2c36666417caa994fd4f085b14620b1a1e425651500358034b04045c0b4f5c0500004b565307011850535705035a0d0303505e55327739560e4457511c06564603692e36714c4f560018011e325f&code=f6fd2b95

chr15.empiricalDose_HRC.vcf.gz (29.56M, expires) Download page: http://mail.qq.com/cgi-bin/ftnExs_download?t=exs_ftn_download&k=6739373455ce10cfaf40590b4534041e4c4d54025b5202521958055607190357525d1a045251061c0100005255510e025109540c632a36525c4b06014d515b415d4b5e570258725e475c687c31771847575f195319340b&code=4974c461

chr15.dose_ASIA.vcf.gz (1.78G, expires 2023-10-01 23:12) Download page: http://mail.qq.com/cgi-bin/ftnExs_download?t=exs_ftn_download&k=7e3738663d8c6fb6ae4e5659153054174d430c045202525a18055e02051d560e5305155f010256150756085005015f0805045e003326665b5d4509531d54094b506879357a71484e5651160149305b&code=578f30f8

LeeDarian commented 1 year ago

> I keep receiving a segmentation fault for the X chromosome only, and I am not sure why. I checked the code and it doesn't seem to do anything different or have any special requirements for chromosome X.

Did you finally solve the problem?

JeffreyTsaiCW commented 6 months ago

I got the same "Segmentation fault" issue when doing meta-imputation without --skipInfo. I really need the Rsq information to evaluate the results. Did anyone solve the problem? Thanks!

dtaliun commented 3 months ago

Hi,

For those experiencing a segmentation fault error when meta-imputing chromosome X, remember to meta-impute PAR and non-PAR regions separately.

You will get a segmentation fault if you try to meta-impute chromosome X without splitting the PAR and non-PAR regions, because MetaMinimac2 determines the ploidy of each sample from the first entry in the input VCFs and will likely end up treating all samples as diploid across the entire chromosome X.
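As a rough illustration of the split, the sketch below assigns chrX positions to PAR1, non-PAR, or PAR2. It is not part of MetaMinimac2, and the function names are hypothetical. It assumes GRCh38 coordinates with the pseudoautosomal boundaries as published for that assembly. GRCh37/hg19 data uses different boundaries, so check the constants against the build of your imputed VCFs before relying on them.

```python
# Pseudoautosomal boundaries on chrX, 1-based and inclusive.
# Assumption: GRCh38. GRCh37/hg19 has different coordinates.
PAR1_END = 2_781_479
PAR2_START = 155_701_383


def chrx_region(position):
    """Classify a chrX position as 'PAR1', 'nonPAR', or 'PAR2'."""
    if position <= PAR1_END:
        return "PAR1"
    if position >= PAR2_START:
        return "PAR2"
    return "nonPAR"


def split_positions(positions):
    """Group chrX positions by region so each group can be processed separately."""
    groups = {"PAR1": [], "nonPAR": [], "PAR2": []}
    for position in positions:
        groups[chrx_region(position)].append(position)
    return groups


print(split_positions([150_000, 2_781_479, 2_781_480, 100_000_000, 155_701_383]))
```

The same boundaries can be used to subset the input VCFs by region with whatever VCF tool you already use, after which each subset is meta-imputed on its own.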

The problematic code is in the void HaplotypeSet::LoadCurrentGT(VcfRecordGenotype & ThisGenotype) function. It would be great to throw a user-friendly error inside this function when the number of tokens in the GT field doesn't match the SampleNoHaplotypes[i] value.
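Until such a check exists in the tool, the same condition can be tested on the input files beforehand. The sketch below is written in Python rather than in MetaMinimac2's C++, and it is not the project's code: `check_ploidy` is a hypothetical helper that mimics the described behaviour by taking each sample's ploidy from the first record and raising a readable error at the first record where a GT field has a different number of alleles. It assumes tab-separated VCF lines with GT present in every FORMAT column.

```python
import re


def ploidy(gt):
    """Number of alleles in a GT string, e.g. '0|1' -> 2 and '1' -> 1."""
    return len(re.split(r"[|/]", gt))


def check_ploidy(vcf_lines):
    """Raise ValueError if any sample's GT ploidy differs from the first record."""
    expected = None
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        gt_index = fields[8].split(":").index("GT")
        current = [ploidy(sample.split(":")[gt_index]) for sample in fields[9:]]
        if expected is None:
            expected = current  # ploidy is fixed by the first data record
            continue
        for i, (want, got) in enumerate(zip(expected, current)):
            if want != got:
                raise ValueError(
                    f"{fields[0]}:{fields[1]} sample index {i}: GT has {got} "
                    f"allele(s), but the first record had {want}"
                )


# Made-up records: the second sample is diploid in the first record
# and haploid in the second one.
records = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "chrX\t2000000\t.\tA\tG\t.\tPASS\t.\tGT:DS\t0|1:1.0\t0|0:0.0",
    "chrX\t3000000\t.\tC\tT\t.\tPASS\t.\tGT:DS\t0|1:1.0\t1:1.0",
]
try:
    check_ploidy(records)
except ValueError as error:
    print(error)
```

Running a check like this over unsplit chromosome X input would point to the first non-PAR record where male samples switch to haploid genotypes.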

P.S. Of course, this doesn't apply if every sample in the study is diploid on chrX, or if chrX haploid samples were coded as homozygous diploids in the VCF.