brentp / slivar

genetic variant expressions, annotation, and filtering for great good.
MIT License

[slivar] warning and confusing results #81

Closed lconde-ucl closed 3 years ago

lconde-ucl commented 3 years ago

Hi, I am running slivar on a dataset that has several unrelated samples and a family with 2 affected siblings and parents. For now I am focusing only on the family, and I am running slivar with the rare disease pipeline from the wiki, with the additions for WGS suggested in the paper (depth >= 10, blacklisted regions).

I tried running each sibling separately and together, with the exact same parameters except for the PED file, with the following results:

  1. PED with sibling 1, mum, dad

    [slivar] Finished. evaluated 12620267 total variants and wrote 25396 variants that passed your slivar expressions.
    sample  comphet_side    denovo  recessive   x_denovo    x_recessive
    sib1    24551   610 82  90  67
    dad 0   0   0   0   0   0
    mum 0   0   0   0   0   0
    [slivar compound-hets] wrote 4 variants that were part of a compound het.
    sample  compound-het
    sib1    4
  2. PED with sibling 2, mum, dad

    [slivar] Finished. evaluated 12620267 total variants and wrote 25963 variants that passed your slivar expressions.
    sample  comphet_side    denovo  recessive   x_denovo    x_recessive
    sib2    24077   677 523 388 810
    dad 0   0   0   0   0
    mum 0   0   0   0   0
    [slivar] warning: no genes found for chromosome preceding rid:23 check your gene annotations have succeeded and that you are pulling the correct sample field(s)
    [slivar compound-hets] wrote 0 variants that were part of a compound het.
    sample  compound-het
    sib2    0
  3. PED with sibling 1, sibling 2, mum, dad

    [slivar] Finished. evaluated 12620267 total variants and wrote 154203 variants that passed your slivar expressions.
    sample  comphet_side    denovo  recessive   x_denovo    x_recessive
    sib1    101948  91  125 20  10
    sib2    100754  91  125 20  10
    dad 0   0   0   0   0
    mum 0   0   0   0   0
    [slivar compound-hets] wrote 62 variants that were part of a compound het.
    sample  compound-het
    sib1    44
    sib2    44

So I have the following questions:

  1. what is the warning that I get in run 2?
  2. why do I get more comp-hets when I run both siblings together than when I run them separately? The only difference between the above runs is the PED file. My understanding is that run 3 should report only the variants in common from runs 1 and 2, and while that seems to be the case for the denovo/recessive/etc models, it is not the case for the compound-hets, but maybe I am misunderstanding the expected behaviour of the tool?

Happy to send the VCF and script if that helps. Many thanks, Lucia
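The expectation in question 2, that the joint run should report only variants found in both single-sibling runs, can be checked directly on the output VCFs. The following is a minimal sketch, not part of slivar: the helper names and the toy VCF lines are made up for illustration, and it assumes each variant is identified by chromosome, position, REF and ALT.

```python
def variant_key(vcf_line):
    """Return (chrom, pos, ref, alt) for one tab-separated VCF data line."""
    chrom, pos, _id, ref, alt = vcf_line.rstrip("\n").split("\t")[:5]
    return (chrom, int(pos), ref, alt)


def load_keys(lines):
    """Collect variant keys, skipping header lines and blank lines."""
    return {variant_key(l) for l in lines if l.strip() and not l.startswith("#")}


def unexpected_in_joint(sib1_lines, sib2_lines, joint_lines):
    """Variants in the joint run that are not in both single-sibling runs."""
    shared = load_keys(sib1_lines) & load_keys(sib2_lines)
    return load_keys(joint_lines) - shared


# Toy data (hypothetical variants, not taken from the runs above).
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
sib1_run = [header, "1\t100\t.\tA\tG\t.\tPASS\t.", "1\t200\t.\tC\tT\t.\tPASS\t."]
sib2_run = [header, "1\t100\t.\tA\tG\t.\tPASS\t."]
joint_run = [header, "1\t100\t.\tA\tG\t.\tPASS\t.", "2\t300\t.\tC\tT\t.\tPASS\t."]

extra = unexpected_in_joint(sib1_run, sib2_run, joint_run)
print(sorted(extra))
```

A non-empty result means the runs differ in something other than the PED file, or that the subset expectation does not hold for that inheritance model.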

brentp commented 3 years ago

hi,

  1. I try to give warnings if something seems odd. It could be that it is chrY and you have no variants in a female sample. If that's the case, you can ignore it, but it seems there might be something odd about sib2.
  2. I'm not sure. It always helps to show the full commands that you ran, along with the full stderr from each command. Showing your PED file would also be helpful.

lconde-ucl commented 3 years ago

Hi Brent, thanks. Sibling 2 is male (sibling 1 is female, but that one didn't show any warnings). I'll email you the full commands, full stderr, PED files and VCF so you can replicate the issue. Regards, Lucia
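For readers following along, the three PED files described in this thread could look roughly like the output of the sketch below. The actual PED files were never posted, so this is an assumption: the family ID `fam1` and the file names are hypothetical, the parents are assumed to be unaffected, and the codes follow the usual PED convention (sex: 1 = male, 2 = female; phenotype: 1 = unaffected, 2 = affected).

```python
FAMILY = "fam1"  # hypothetical family ID

# Columns: family, sample, father, mother, sex, phenotype.
ROWS = {
    "sib1": (FAMILY, "sib1", "dad", "mum", "2", "2"),  # affected female
    "sib2": (FAMILY, "sib2", "dad", "mum", "1", "2"),  # affected male
    "dad": (FAMILY, "dad", "0", "0", "1", "1"),        # assumed unaffected
    "mum": (FAMILY, "mum", "0", "0", "2", "1"),        # assumed unaffected
}

# One sample list per run described in the thread (file names are hypothetical).
PEDS = {
    "sib1_trio.ped": ["sib1", "dad", "mum"],
    "sib2_trio.ped": ["sib2", "dad", "mum"],
    "quartet.ped": ["sib1", "sib2", "dad", "mum"],
}


def ped_text(samples):
    """Render the tab-separated PED lines for the given samples."""
    return "".join("\t".join(ROWS[s]) + "\n" for s in samples)


def write_peds(directory="."):
    """Write all three PED files into the given directory."""
    for name, samples in PEDS.items():
        with open(f"{directory}/{name}", "w") as fh:
            fh.write(ped_text(samples))


print(ped_text(PEDS["quartet.ped"]), end="")
```

Generating all three files from one table keeps the sex and phenotype columns identical across runs, so the PED file really is the only thing that changes.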

lconde-ucl commented 3 years ago

Hi Brent,

Sorry, I found that the issue was a typo: I had 'INFO.topmed_af < 0.0' (as opposed to 'INFO.topmed_af < 0.05') when I ran the siblings individually. Once I corrected the typo, the warning went away and there are no more issues with the numbers of comp-hets.

Thanks and sorry about that! Lucia
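To illustrate why this kind of typo matters: a real allele frequency is never negative, so a cutoff of `< 0.0` rejects every variant that carries a true frequency. The sketch below uses made-up frequencies and hypothetical helper names. It does not model how slivar encodes variants that are absent from the annotation source, which also affects what a real run keeps.

```python
def count_passing(allele_freqs, cutoff):
    """Count variants whose allele frequency is strictly below the cutoff."""
    return sum(1 for af in allele_freqs if af < cutoff)


def check_af_cutoff(cutoff):
    """Reject cutoffs that no real allele frequency can satisfy."""
    if cutoff <= 0.0:
        raise ValueError(f"allele-frequency cutoff {cutoff} can never be met")
    return cutoff


# Toy allele frequencies (hypothetical, not from the dataset in this issue).
toy_afs = [0.0, 0.0001, 0.01, 0.04, 0.2]

typo_count = count_passing(toy_afs, 0.0)       # the mistyped cutoff
intended_count = count_passing(toy_afs, 0.05)  # the intended cutoff
print(typo_count, intended_count)
```

A small guard like `check_af_cutoff` in a wrapper script would have flagged the mistyped expression before the run started.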