brentp / slivar

genetic variant expressions, annotation, and filtering for great good.
MIT License

Missing de novo variant #129

Open krukanna opened 2 years ago

krukanna commented 2 years ago

Hi @brentp, I identify a possible de novo variant and then try to confirm it with slivar. In some cases slivar does not report it. Do you know which criteria are likely to exclude this variant?

My command:

/home/bioinf/Ania/De_novo/slivar/slivar expr \
   --pass-only \
   --vcf $VCF_ann \
   --ped $SAMPLES_RELATION \
   --gnotate /slivar/gnomad.hg38.genomes.v3.fix.zip \
   --js /slivar/slivar-functions.js \
   --out-vcf $DIR_OUTPUT/de_novo_output.vcf \
   --info 'INFO.impactful && INFO.gnomad_popmax_af < 0.05 && variant.FILTER == "PASS" && variant.ALT[0] != "*"' \
   --family-expr 'denovo:fam.every(segregating_denovo) && INFO.gnomad_popmax_af < 0.05'

FAM File:

family_ID sample_ID paternal_ID maternal_ID sex phenotype (2 is case)
R1 sister father mother 2 2
R1 proband father mother 2 1
R1 mother 0 0 2 2
R1 father 0 0 1 2

Variant from vcf file:

CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sister      proband      mother      father
chr19   49597592        .       C       A       1234.27 PASS    AC=1;AF=0.004386;AN=228;CSQ=A|missense_variant|MODERATE|PRR12|ENSG00000126464|Transcript|ENST00000418929|protein_coding|4/14||||3730|3257|1086|P/Q|cCa/cAa||1||1||SNV|HGNC|HGNC:29217|YES|NM_020719.3|5|P2|CCDS46143.1|ENSP00000394510|Q9ULL5||UPI0001596889|1|tolerated(0.26)|probably_damaging(0.999)|PANTHER:PTHR14709&PANTHER:PTHR14709:SF1||||||||||||||||||||||||||||||,A|missense_variant|MODERATE|PRR12|ENSG00000126464|Transcript|ENST00000615927|protein_coding|2/12||||794|794|265|P/Q|cCa/cAa||1||1||SNV|HGNC|HGNC:29217|||5|A2||ENSP00000478000||A0A3Q5ADB5|UPI00001C200A|1|tolerated(0.11)|probably_damaging(0.997)|PANTHER:PTHR14709&PANTHER:PTHR14709:SF1||||||||||||||||||||||||||||||,A|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000592640||||||||||||1||||SNV|||||||||||||||||||||||||||||||||||||||||||||   GT:AD:DP:GQ:PL  0/0:41,0:41:99:0,99,1485        0/1:39,44:83:99:1255,0,1144     0/0:32,0:32:65:0,65,1084        0/0:42,0:42:99:0,102,1454

Could this be caused by the FAM file, for example if at least one of the parents is unknown or a case?

brentp commented 2 years ago

That variant is only present in the unaffected sample. segregating_denovo, as written, looks for variants that are only in the affected sample(s). You could modify the function to find those, but that's not how it's written now.
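To make the point above concrete, here is a minimal sketch of a per-sample rule of the kind described: affected samples must carry the variant and unaffected samples must be homozygous reference. This is not the actual code in slivar-functions.js. The function name `sketchSegregatingDenovo`, the helper `sample`, and the hand-built sample objects are hypothetical, and real filters typically add depth, genotype-quality, and allele-balance checks that are omitted here. The genotypes are taken from the VCF line in the question.

```javascript
// Simplified illustration only, NOT the real segregating_denovo.
// Assumed rule: affected samples must be het, unaffected must be hom-ref.
function sketchSegregatingDenovo(s) {
  if (s.affected) {
    return s.het;
  }
  return s.hom_ref;
}

// Hypothetical helper that builds a sample object from a genotype string.
function sample(id, affected, gt) {
  return { id: id, affected: affected, het: gt === "0/1", hom_ref: gt === "0/0" };
}

// FAM file as posted: proband coded 1 (unaffected), everyone else coded 2 (case).
const famAsPosted = [
  sample("sister", true, "0/0"),
  sample("proband", false, "0/1"),
  sample("mother", true, "0/0"),
  sample("father", true, "0/0"),
];

// Hypothetical alternative coding: only the proband is a case.
const famProbandOnly = [
  sample("sister", false, "0/0"),
  sample("proband", true, "0/1"),
  sample("mother", false, "0/0"),
  sample("father", false, "0/0"),
];

console.log(famAsPosted.every(sketchSegregatingDenovo));    // false
console.log(famProbandOnly.every(sketchSegregatingDenovo)); // true
```

Under this assumed rule, the posted coding fails because the only carrier is the sample marked unaffected, which matches the explanation above.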

krukanna commented 2 years ago

Thanks for your quick response. It's not actually unaffected; its status is unknown. I made a mistake, it was supposed to be R1 proband father mother 2 0. I also don't get any de novo variants when I change the FAM file like this (everyone is a case):

#family_ID sample_ID paternal_ID maternal_ID sex phenotype (2 is case)
R1 sister father mother 1 2
R1 proband father mother 2 2
R1 mother 0 0 2 2
R1 father 0 0 1 2

So what if the other family members are also affected (not completely healthy, though possibly with a different disease)? Should I mark the others as unknown instead of case?
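As a rough way to see why the all-case coding also yields no de novo call, the sketch below lists which samples would break a rule of the form "affected samples carry the variant, unaffected samples are homozygous reference". That rule is an assumption based on the maintainer's description in this thread, not the actual slivar-functions.js code, and `failingSamples` and the sample objects are hypothetical. How slivar treats unknown phenotype (0) is not covered here.

```javascript
// Hypothetical diagnostic, not part of slivar: return the IDs of samples
// that violate the assumed rule "affected => het, unaffected => hom-ref".
function failingSamples(family) {
  return family
    .filter(function (s) { return s.affected ? !s.het : !s.hom_ref; })
    .map(function (s) { return s.id; });
}

// Second FAM file from the thread: every sample coded as case (2).
// Genotypes come from the VCF line in the original question.
const allCase = [
  { id: "sister",  affected: true, het: false, hom_ref: true },
  { id: "proband", affected: true, het: true,  hom_ref: false },
  { id: "mother",  affected: true, het: false, hom_ref: true },
  { id: "father",  affected: true, het: false, hom_ref: true },
];

console.log(failingSamples(allCase)); // [ 'sister', 'mother', 'father' ]
```

Under this assumption, marking the hom-ref relatives as cases makes them fail the check, so the family-level `every` test cannot pass.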