Closed baozg closed 5 months ago
Hi,
Thanks for bringing up this issue. It'll be a bit tricky to debug this without having access to the files. Would it be possible to share the input files so we can try to reproduce this? Thanks!
Could you give me an email address so that I can send you a link to this chromosome's data?
Sure thing! You can send me the files at lucasbrambrink@google.com
Additionally, Seg faults can sometimes happen from OOMs (running out of memory). Do you have the memory specs of the instance you are running this on? Thanks!
It was run on a node with 256 GB of RAM, and all other samples finished on nodes with the same amount of RAM. I will send you the data later.
I also encountered the same problem. May I ask whether it has been solved, and if so, how?
@baozg and @yangxin-9 ,
In addition to sending the BAM files, can you please also check that the files are not truncated? You can run the following command to check if the files are OK:
samtools quickcheck -v *.bam > bad_bams.fofn && echo 'all ok' || echo 'some files failed check, see bad_bams.fofn'
OK. I'll try that. Thank you for your reply.
I checked my BAM file with the command you gave, and it shows 'all ok'. The error may not be caused by the BAM file.
@baozg
After carefully bisecting your BAM file, it looks like the region that throws an error is chr12:7721068-7735636.
Looking at the pileup, there are 5 large (~11k) deletions in that region of 3 different lengths: one is length 11,843, two are 11,844, and two are 11,845. It looks like the trouble comes from attempting to represent and realign those INDEL candidates with 2 reads each. DeepVariant can't actually call deletions that long.
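For anyone who wants to look for similar reads in their own data, here is a minimal sketch of a CIGAR scan. It is not how the bisection above was done; the function name `long_deletions` and the default 10,000-base threshold are assumptions chosen for illustration. It reads SAM records on stdin and reports every deletion at or above the threshold.

```shell
# List reads whose CIGAR contains a deletion of at least MIN_DEL bases.
# Reads SAM records on stdin; prints tab-separated:
#   read name, chromosome, alignment start, deletion length.
# The function name and the default threshold are illustrative assumptions.
long_deletions() {
  min_del="${1:-10000}"
  awk -v min_del="$min_del" '
    /^@/ { next }                      # skip SAM header lines
    {
      cigar = $6
      # Walk the CIGAR one operation at a time (e.g. 5000M11843D3000M).
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        len = substr(cigar, 1, RLENGTH - 1) + 0
        op  = substr(cigar, RLENGTH, 1)
        if (op == "D" && len >= min_del) {
          print $1 "\t" $3 "\t" $4 "\t" len
        }
        cigar = substr(cigar, RLENGTH + 1)
      }
    }
  '
}
```

Assuming samtools is installed and the BAM is indexed, it could be fed with something like `samtools view sample.sorted.bam chr12:7721068-7735636 | long_deletions 10000`, where `sample.sorted.bam` is a placeholder file name.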
If you set vsc_min_count_indels to 3, the problem goes away, so adding --make_examples_extra_args=vsc_min_count_indels=3 should fix the issue. If desired, you can use --regions=chr12:7721068-7735636 to run DeepVariant on just that region.
We will work on fixing this on our end as well in our next release.
@yangxin-9 To avoid mixing issues that may or may not be related, please create a new issue that shows the command you ran and the output. Also, if possible, please send us the input files used so we can try to reproduce the issue ourselves.
Thanks for your careful examination. It's quite common to see this kind of divergent region in outcrossing plants, where mapping noise is mixed with true variants. Is it possible to report the region or reads when realignment fails? Or do I need to exclude this region before calling with DeepVariant?
Right now DeepVariant does not have the ability to report such a region by itself and skip it. You will need to exclude the problematic regions before running DeepVariant, or use vsc_min_count_indels to avoid candidate generation in these cases.
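One way to exclude a region is to call only its flanks. The sketch below is an illustration rather than an official recipe: the function name `flanking_regions` is made up here, and it assumes that the 1-based, inclusive interval reported above fully covers the trouble spot and that you know the chromosome length (for example from column 2 of the reference `.fai` index).

```shell
# Print the regions flanking a problematic interval on one chromosome.
# Arguments: chromosome name, interval start, interval end (1-based, inclusive),
# and chromosome length. The function name is an illustrative assumption.
flanking_regions() {
  chrom="$1"; start="$2"; end="$3"; chrom_len="$4"
  # Left flank: everything before the interval, if any.
  if [ "$start" -gt 1 ]; then
    printf '%s:1-%s\n' "$chrom" "$((start - 1))"
  fi
  # Right flank: everything after the interval, if any.
  if [ "$end" -lt "$chrom_len" ]; then
    printf '%s:%s-%s\n' "$chrom" "$((end + 1))" "$chrom_len"
  fi
}
```

The two printed regions could then be joined with a space and passed to `--regions`, which should accept a space-separated list of region literals; please check this against the documentation for your DeepVariant version.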
Thank you so much. Now this sample runs smoothly.
Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.6.1/docs/FAQ.md: Yes
Describe the issue: (A clear and concise description of what the issue is.)
Fatal Python error: Segmentation fault when make_examples
Setup
seqtk -X 5
with one fasta. It worked with 30 samples, but one chromosome of one sample could not finish because of this error.
Steps to reproduce:
chr=$3
indir="01.mapping"
outdir="02.snps"
sif="dv-1.6.0.sif"
singularity exec -B ${indir}:/input -B ${outdir}:/output ${sif} /bin/bash -c "/opt/deepvariant/bin/run_deepvariant --model_type PACBIO --ref /input/ref.fa --reads /input/${sample}.sorted.bam --regions chr${chr} --output_vcf=/output/${sample}.chr${chr}.vcf.gz --output_gvcf=/output/${sample}.chr${chr}.g.vcf.gz --intermediate_results_dir=/output/${sample}_chr${chr} --num_shards=${threads} --sample_name=${sample}"
rm -rf ${outdir}/${sample}_chr${chr}
Does the quick start test work on your system? Please test with https://github.com/google/deepvariant/blob/r1.6/docs/deepvariant-quick-start.md. Is there any way to reproduce the issue by using the quick start?
Any additional context: