gymreklab / STRDenovoTools

Toolkit for calling and analyzing de novo STR mutations
GNU General Public License v3.0

ERROR: Failed to extract string FORMAT value from VCF record #17

Open atyryshkina opened 3 years ago

atyryshkina commented 3 years ago

I ran into this error when running MonSTR on VCFs created by GangSTR. I traced it to a missing field: MonSTR looks for a field called ENCLREADS in the VCF file, and the version of GangSTR I used doesn't output it. I was using GangSTR v2.4, which is the version included in the Docker image gymreklab/str-toolkit (https://hub.docker.com/r/gymreklab/str-toolkit). Switching to GangSTR v2.5 fixed the problem.
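A quick way to catch this before running MonSTR is to check whether the VCF header declares the field. The following is only a minimal sketch, not part of MonSTR or GangSTR: `has_format_field` is a hypothetical helper name, and it assumes the header declares the field with a standard `##FORMAT=<ID=...,` line with `ID` as the first key.

```shell
# Hypothetical pre-flight check (not part of MonSTR or GangSTR):
# succeeds only if the VCF header declares the given FORMAT field.
has_format_field() {
    vcf=$1
    field=$2
    # The VCF may be plain text or gzip/bgzip-compressed; handle both.
    case "$vcf" in
        *.gz) gzip -dc "$vcf" ;;
        *)    cat "$vcf" ;;
    esac | awk -v field="$field" '
        # Header lines start with "#"; stop at the first data record.
        !/^#/ { exit }
        # Assumption: the declaration starts with ##FORMAT=<ID=<field>,
        index($0, "##FORMAT=<ID=" field ",") == 1 { found = 1; exit }
        END { exit(found ? 0 : 1) }
    '
}
```

Running `has_format_field "$vcf" ENCLREADS` before the MonSTR command returns a non-zero exit status when the field is not declared, which would have pointed at the GangSTR version mismatch directly.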

I didn't see anything about this in the GitHub README, and the error message was vague enough that I thought I should post an issue.

Anastasia

GangSTR Command

singularity exec durga_cache/str-toolkit.simg  \
    GangSTR \
        --bam $bamfiles \
        --ref $fasta \
        --regions hg19_ver13_1.bed \
        --out $tmp_dir/$family \
        --include-ggl 

MonSTR command

MonSTR \
     --strvcf $vcf \
     --fam $ped \
     --gangstr \
     --out $out \
     --region chr19 

Error message

[MonSTR-2.0] ProgressMeter: PedigreeSet has 1 nuclear families with STR data. Unaffected children: 2 Affected children: 0 Unknown children: 0
[MonSTR-2.0] ProgressMeter: Running de novo analysis...
[MonSTR-2.0] ProgressMeter: Opening priors file...
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:64420-64429 with 5 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:68062-68073 with 8 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:73165-73179 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:77098-77113 with 7 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:81518-81529 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:82229-82240 with 7 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:84718-84729 with 5 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:86447-86478 with 7 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:88826-88837 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:92094-92119 with 12 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:92177-92191 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:98941-98952 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:111045-111059 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:121729-121738 with 5 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:124645-124656 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:127790-127801 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:128904-128915 with 9 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:141013-141027 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:141556-141567 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:161680-161691 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:164287-164302 with 7 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:178430-178439 with 5 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:183794-183805 with 7 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:184457-184468 with 6 alleles.
[MonSTR-2.0] ProgressMeter: Processing STR region chr19:185253-185264 with 8 alleles.
[MonSTR-2.0] ERROR: Failed to extract string FORMAT value from VCF record

gymreklab commented 3 years ago

Thanks for pointing this out! For now, I have made an updated Docker image here: https://hub.docker.com/r/gymreklab/monstr, based on the Dockerfile in this repo. I'm going to move this issue over to the TRTools repo so we can set up the Docker images in a more organized way to deal with different tool versions.