arq5x / gemini

a lightweight db framework for exploring genetic variation.
http://gemini.readthedocs.org
MIT License

Exception: gt_bases not implemented for ploidy > 2 #931

Open gormleymp opened 4 years ago

gormleymp commented 4 years ago

I have some odd VCFs where the GTs all appear to be triallelic, but they do not match the REF and ALT fields. The VCFs were generated using freebayes, but I do not have much more information than this. I am getting an error from cyvcf2: "Exception: gt_bases not implemented for ploidy > 2". I assume this error comes from the odd GT fields. I have tried to decompose these variants using vt, bcftools, and vcflib, but these tools do not seem to recognize these genotypes as multiallelic. Any thoughts on how to resolve this?

REF = A, ALT = G, TUMOR GT = 0/0/1, NORMAL GT = 0/0/1

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TUMOR NORMAL

1 63643 1:63643_A/G A G 16927.1 . NS=2;DP=1710;DPB=1710;AC=2;AN=6;AF=0.333333;RO=1097;AO=613;PRO=0;PAO=0;QR=44389;QA=24928;PQR=0;PQA=0;SRF=599;SRR=498;SAF=357;SAR=256;SRP=23.2028;SAP=39.146;AB=0.35848;ABP=300.484;RUN=1;RPP=163.724;RPPR=209.526;RPL=200;RPR=413;EPP=60.1452;EPPR=4.24747;DPRA=0;ODDS=116.368;GTI=0;TYPE=snp;CIGAR=1X;NUMALT=1;MEANALT=1;LEN=1;MQM=56.3605;MQMR=50.3263;PAIRED=0.998369;PAIREDR=0.999088;technology.ILLUMINA=1;ANN=G|intergenic_region|MODIFIER|FAM138A-OR4F5|FAM138A.2-OR4F5|intergenic_region|FAM138A.2-OR4F5|||n.63643A>G|||||| GT:GQ:DP:AD:RO:QR:AO:QA:SRF:SRR:SAF:SAR 0/0/1:160.002:242:156,86:156:6366:86:3518:80:76:45:41 0/0/1:160.002:1468:941,527:941:38023:527:21410:519:422:312:215

brentp commented 4 years ago

Hi, unfortunately you won't be able to use gemini for VCFs like this that have non-diploid genotypes. That is a separate issue from decomposing.
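
For context: a GT of 0/0/1 lists three alleles per call (ploidy 3) against a single ALT allele, so there is presumably nothing for a decomposition tool to split. Below is a minimal sketch, in plain Python, of how such calls could be flagged before trying to load a file. It assumes tab-separated VCF data lines with FORMAT in column 9 and samples from column 10 onward. The helper names `gt_ploidy` and `non_diploid_samples` are hypothetical and not part of gemini or cyvcf2, and the record is abbreviated from the one posted above.

```python
import re


def gt_ploidy(sample_field, format_field):
    """Return the number of alleles in one sample's GT call."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    gt = values[keys.index("GT")]
    # Alleles are separated by '/' (unphased) or '|' (phased).
    return len(re.split(r"[/|]", gt))


def non_diploid_samples(vcf_line):
    """Map sample index (0-based, in column order) to ploidy for calls with ploidy > 2."""
    cols = vcf_line.rstrip("\n").split("\t")
    fmt, samples = cols[8], cols[9:]
    flagged = {}
    for i, sample in enumerate(samples):
        ploidy = gt_ploidy(sample, fmt)
        if ploidy > 2:
            flagged[i] = ploidy
    return flagged


# Abbreviated version of the record from this issue (INFO and FORMAT shortened).
record = "\t".join([
    "1", "63643", "1:63643_A/G", "A", "G", "16927.1", ".", "NS=2",
    "GT:GQ:DP", "0/0/1:160.002:242", "0/0/1:160.002:1468",
])

print(non_diploid_samples(record))  # {0: 3, 1: 3}
```

Both TUMOR and NORMAL are reported with ploidy 3, which matches the condition named in the exception message. This only identifies the affected records; it does not make them loadable.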