brentp / slivar

genetic variant expressions, annotation, and filtering for great good.
MIT License
247 stars 23 forks source link

hemizygous variants didn't get picked #154

Open lindakjcao opened 11 months ago

lindakjcao commented 11 months ago

Hi,

We have two hemizygous variants failed to be picked by the silver cmd line below:

  Silver '--group-expr "proband:'+
      'proband_has_variant(kid) && '+
      'variant.FILTER == \'PASS\' '+
      '&& \'FT\' in kid && kid.FT == \'PASS\' '+
      '&& variant.QUAL >= 30"'

It seems the logic probing_has_variant(kid) doesn't handle this situation when GT = 1 (hemizygous). Could you please take a look? The version of Silver is v0.2.7. Thank you. Here are the variants:

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT kid mom dad

chrX 154535277 rs1050829 T C 199.46 PASS AC=2;AF=0.500;AN=4;DP=68;FS=0.000;MQ=250.00;MQRankSum=4.951;QD=3.50;ReadPosRankSum=1.052;SOR=0.521;DB GT:AD:AF:DP:GQ:FT:F1R2:F2R1:PL:GP:PP:DN 1:0,23:1.000:23:119:PASS:0,13:0,10:154,0:1.1923e+02,0.0000e+00:154,0:Inherited 0:21,0:0.000:11:99:PASS:.:.:0,384:.:0,258:. 0/1:18,16:0.471:34:48:PASS:7,6:11,10:85,0,50:4.9792e+01,6.7304e-05,5.3000e+01:130,0,53:. chrX 154536002 rs1050828 C T 186.46 PASS AC=2;AF=0.500;AN=4;DP=67;FS=7.399;MQ=250.00;MQRankSum=4.540;QD=3.39;ReadPosRankSum=3.433;SOR=1.302;DB GT:AD:AF:DP:GQ:FT:F1R2:F2R1:PL:GP:PP:DN 1:0,19:1.000:19:107:PASS:0,7:0,12:141,0:1.0672e+02,0.0000e+00:141,0:Inherited 0:21,0:0.000:11:99:PASS:.:.:0,384:.:0,258:. 0/1:18,18:0.500:36:48:PASS:8,8:10,10:85,0,50:5.0000e+01,6.5233e-05,5.3000e+01:130,0,53:.

brentp commented 11 months ago

proband_has_variant is not in the js that I distribute with slivar, so you'll have to share that code.

lindakjcao commented 11 months ago

Thank you so much for the quick response. Sorry about that. Here is the code: function proband_has_variant(proband) { return proband.alts > 0 }

brentp commented 11 months ago

slivar is most tested on diploid variants but I think this should work. Are you using the latest version?

lindakjcao commented 11 months ago

we used v0.2.7. Probably not the latest one.

brentp commented 11 months ago

please try with latest then we will check from there.

lindakjcao commented 11 months ago

Hi, we found another case that has same hemi variant at 154535277, it was picked by Silvar. The only difference is the case is a singleton. So hemizygosity seems not to be the cause but something else. Any suggestion? chrX 154535277 rs1050829 T C 80.10 PASS AC=1;AF=1.000;AN=1;DP=10;FS=0.000;MQ=250.00;QD=8.01;SOR=0.693;DB GT:AD:AF:DP:GQ:FT:F1R2:F2R1:PL:GP 1:0,10:1.000:10:80:PASS:0,6:0,4:115,0:8.0096e+01,0.0000e+00

brentp commented 11 months ago

can you try with latest version? if the results are not what you expect, then send a valid VCF with the 2 variants and the VCF header and I will take a look.

weixuanfu commented 11 months ago

Hi @brentp thank you for replying this issue so quickly.

I made some mocked files to reproduce this issue.

The 1st set is a trio (attached trio.zip) that mentioned above. proband.zip trio.zip

  1. I tested command below with silvar v0.3.0, 0 variant can pass
/tools/slivar/0.3.0/slivar expr \
  --pass-only \
  --vcf test.vcf \
  --ped ped.txt \
  --alias alias.txt \
  --js filters.js \
   \
   --group-expr "proband:proband_has_variant(kid)" \
  --out-vcf test_out.vcf

2 I tested command below with inputs in proband.zip (which only has proband's genotype), 2 variants can pass

/tools/slivar/0.3.0/slivar expr \
  --pass-only \
  --vcf test.single.vcf \
  --ped ped.single.txt \
  --alias alias.single.txt \
  --js filters.js \
   \
  --group-expr "proband:proband_has_variant(kid)" \
  --out-vcf test_out.vcf
  1. However, still with inputs in proband.zip, I changed the expression as shown below, then 0 variant can pass:
/tools/slivar/0.3.0/slivar expr \
  --pass-only \
  --vcf test.single.vcf \
  --ped ped.single.txt \
  --alias alias.single.txt \
  --js filters.js \
   \
  --group-expr "proband:proband_has_variant(kid) && variant.FILTER == 'PASS' && 'FT' in kid && kid.FT == 'PASS' && variant.QUAL >= 30" \
  --out-vcf test_out.vcf
weixuanfu commented 11 months ago

Correction for test 3 above. It works with export SLIVAR_FORMAT_STRINGS=1

weixuanfu commented 11 months ago

I also tested duo family structure: duo.nodad.zip duo.nomom.zip

Test 4. duo of kid + mom (kid (GT=1)+mom (GT=0/1) in duo.nodad.zip) with commend below, silver cannot pick the 2 variants

export SLIVAR_FORMAT_STRINGS=1
/mnt/isilon/dgd_public/clin-air/v2.0.0/tools/slivar/0.3.0/slivar expr \
  --pass-only \
  --vcf test.duo.nodad.vcf \
  --ped ped.duo.nodad.txt \
  --alias alias.duo.nodad.txt \
  --js filters.js \
   \
  --group-expr "proband:proband_has_variant(kid) && variant.FILTER == 'PASS' && 'FT' in kid && kid.FT == 'PASS' && variant.QUAL >= 30" \
  --out-vcf test_out.vcf

However Test 5. duo for kid + dad (kid (GT=1) + dad (GT=0) in duo.nomom.zip) works with commands below, both variants can be picked up.

export SLIVAR_FORMAT_STRINGS=1
/mnt/isilon/dgd_public/clin-air/v2.0.0/tools/slivar/0.3.0/slivar expr \
  --pass-only \
  --vcf test.duo.nomom.vcf \
  --ped ped.duo.nomom.txt \
  --alias alias.duo.nomom.txt \
  --js filters.js \
   \
  --group-expr "proband:proband_has_variant(kid) && variant.FILTER == 'PASS' && 'FT' in kid && kid.FT == 'PASS' && variant.QUAL >= 30" \
  --out-vcf test_out.vcf

I guess that silvar may not handle haploid and diploid variants at the same time.

weixuanfu commented 11 months ago

I have made another duo with kid + mom but only change kid's GT from 1 to 1/1 in vcf (kid's GT = 1/1 and mom' GT = 0/1), then slivar can pick both variants when there is no mix of haploid and diploid GT in vcf.

weixuanfu commented 11 months ago

I checked the source codes about genotypes. the kid.alts should be -1 in this mix type GT in vcf.

So if I change the function proband_has_variant from

function proband_has_variant(proband) {
    return proband.alts > 0
}

to

function proband_has_variant(proband) {
    return proband.AD[1] > 0
}

the 2 variants can be picked in all the tests above

brentp commented 11 months ago

ok. I think that's a better way to go. alts really makes most sense for diploid variants. also, I think you shouldn't use both --ped and --alias

weixuanfu commented 11 months ago

will silvar have better support for hemizygous genotypes? like alts=4 for hem-ref and alts=5 for hem-alt.

brentp commented 11 months ago

Yes, I committed last month code to handle accessing the genotype: https://github.com/brentp/slivar/commit/9f7d995686f1324a9b94381c4ee2caf0feca501d

so you'll be able to do sample.GT which would be [0, 1] for a heterozygous allele, and [1] for a hemizyous variant. I'll try to get this out next week.

weixuanfu commented 11 months ago

Thank you! Could you please let me know how to compile the beta version of silvar from source codes? I need it to test the logic below:

(proband.alts > 0 || (proband.alts < 0 &&  proband.GT[0] == 1)
brentp commented 11 months ago

You can try it the attached slivar_dev.gz

just download, gunzip, chmod +x and run as ./slivar_dev