Griffan / VerifyBamID

VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using ancestry-agnostic method.
http://griffan.github.io/VerifyBamID/
Segmentation fault #38

Closed jianbiaoli closed 2 years ago

jianbiaoli commented 2 years ago

I tried to generate my own resource files, but I keep getting the following error. Could you help me with this issue?

Segmentation fault VerifyBamID.Linux.x86-64.v2.0.1 --RefVCF 1kgp.chm13.v1.1.snp.100k.vcf.gz --Reference chm13v2.0_maskedY_rCRS.fa

Best, Jianbiao Li

Griffan commented 2 years ago

Sure, I need more information to debug: 1) Can you see the help message when you run it without any arguments? 2) If yes, could you briefly describe the contents of your VCF and FASTA files (dimensions, chromosome names, etc.)? 3) If not, could you provide more information about your environment?

jianbiaoli commented 2 years ago

1. Yes, I can see the help message.

$ /path/VerifyBamID.Linux.x86-64.v2.0.1
VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using ancestry-agnostic method.

 Version:2.0.1
 Copyright (c) 2009-2020 by Hyun Min Kang and Fan Zhang
 This project is licensed under the terms of the MIT license.

The following parameters are available.  Ones with "[]" are in effect:

Available Options
                    Input/Output Files : --BamFile [Empty],
                                         --PileupFile [Empty],
                                         --Reference [Empty],
                                         --SVDPrefix [Empty],
                                         --Output [result]
               Model Selection Options : --WithinAncestry,
                                         --DisableSanityCheck, --NumPC [2],
                                         --FixPC [Empty],
                                         --FixAlpha [-1.0e+00],
                                         --KnownAF [Empty], --NumThread [4],
                                         --Seed [12345], --Epsilon [1.0e-08],
                                         --OutputPileup, --Verbose
   Construction of SVD Auxiliary Files : --RefVCF [Empty]
                        Pileup Options : --min-BQ [13], --min-MQ [2],
                                         --adjust-MQ [40], --max-depth [8000],
                                         --no-orphans, --incl-flags [1040],
                                         --excl-flags [1796]
                    Deprecated Options : --UDPath [Empty], --MeanPath [Empty],
                                         --BedPath [Empty]

FATAL ERROR -
--UDPath is required when --RefVCF is absent

Exiting due to ERROR:
        Exception was thrown
2. The raw VCF and FASTA files were downloaded from https://github.com/marbl/CHM13. The chromosome names look like "chr1", "chr22", and so on, and are consistent between the VCF and the FASTA. The subsetted VCF has ~130k SNPs and 3202 samples. I also tried 50k, 10k, and 1k SNPs, but all gave the same error.
3.0G    chm13v2.0_maskedY_rCRS.fa
2.4G    1kgp.chm13.v1.1.snp.100k.vcf.gz
183M    1kgp.chm13.v1.1.snp.10k.vcf.gz

$bcftools stats 1kgp.chm13.v1.1.snp.100k.vcf.gz
# SN    [2]id   [3]key  [4]value
SN      0       number of samples:      3202
SN      0       number of records:      131206
SN      0       number of no-ALTs:      0
SN      0       number of SNPs: 131206
SN      0       number of MNPs: 0
SN      0       number of indels:       0
SN      0       number of others:       0
SN      0       number of multiallelic sites:   0
SN      0       number of multiallelic SNP sites:       0

$bcftools stats 1kgp.chm13.v1.1.snp.10k.vcf.gz
# SN    [2]id   [3]key  [4]value
SN      0       number of samples:      3202
SN      0       number of records:      11000
SN      0       number of no-ALTs:      0
SN      0       number of SNPs: 11000
SN      0       number of MNPs: 0
SN      0       number of indels:       0
SN      0       number of others:       0
SN      0       number of multiallelic sites:   0
SN      0       number of multiallelic SNP sites:       0

Best, Jianbiao Li
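
For anyone who wants to double-check the claim that chromosome names are consistent between the VCF and the FASTA, here is a minimal Python sketch. It is not part of VerifyBamID; the function names are hypothetical, and it assumes plain-text lines (a `.vcf.gz` would need to be opened with `gzip.open(path, "rt")`).

```python
def fasta_contigs(lines):
    """Collect sequence names from FASTA headers (text after '>' up to the first whitespace)."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def vcf_contigs(lines):
    """Collect CHROM values from VCF data lines, skipping '#' header lines and blanks."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def missing_from_reference(vcf_lines, fasta_lines):
    """Return the sorted VCF contig names that have no matching FASTA sequence."""
    return sorted(vcf_contigs(vcf_lines) - fasta_contigs(fasta_lines))
```

An empty list from `missing_from_reference` means every contig used in the VCF also exists in the reference, which would rule out a naming mismatch as the cause.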

Griffan commented 2 years ago

Could you share with me your 1k snps vcf?

jianbiaoli commented 2 years ago

I think I found the reason. "0|0", ".|.", "./.:0,0:.:.:.:.:.:." and "./.:0,0:.:.:." in the genotype fields are not accepted by VerifyBamID. It succeeded after I removed those problematic lines.

Best, Jianbiao Li
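
The filtering step described above could be sketched in Python roughly as follows. This is an illustration only, not the commands actually used in this thread: the function names are hypothetical, and it assumes that the problematic records are those where at least one sample has a missing allele (`.`) in its GT subfield. The thread also lists "0|0", but it is not clear from the discussion why that value would be rejected, so this sketch leaves such genotypes alone.

```python
def has_missing_genotype(record_line):
    """Return True if any sample's GT subfield contains a missing allele ('.')."""
    fields = record_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no FORMAT/sample columns, nothing to check
    fmt = fields[8].split(":")
    if "GT" not in fmt:
        return False
    gt_index = fmt.index("GT")
    for sample in fields[9:]:
        parts = sample.split(":")
        # Trailing subfields may be omitted; treat an absent GT as missing.
        gt = parts[gt_index] if gt_index < len(parts) else "."
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            return True
    return False


def drop_missing_genotype_records(lines):
    """Yield header lines unchanged and only those records with no missing genotypes."""
    for line in lines:
        if line.startswith("#") or not has_missing_genotype(line):
            yield line
```

To apply it to a compressed file, the lines could be read with `gzip.open(path, "rt")` and the surviving lines written to a new VCF, which would then need to be compressed and indexed again before use.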

Griffan commented 2 years ago

Thanks, this is really helpful! I will try to draft a patch to explicitly print this error.