arq5x / lumpy-sv

lumpy: a general probabilistic framework for structural variant discovery
MIT License

Question about filtering germline or somatic SV? #268

Closed lss1227 closed 6 years ago

lss1227 commented 6 years ago

Hi, I have read many posts about filtering Lumpy SV results after genotyping with SVTyper but there are still questions confusing me.

  1. For a normal sample, should I filter germline SVs with the following steps?
    a) Remove SVs whose GT==0/0.
    b) Make sure that total coverage over the SV is also reasonable (filter on DP? I don't know the threshold for 30x WGS).
    c) Filter on SU? In Delly, SVs are flagged as "PASS" if >=3 paired-ends support the variant. SU has a strict mapping quality filter (from this https://groups.google.com/d/msg/lumpy-discuss/aKpdTth7vBc/NFBct-p5CgAJ), which I think makes it more reasonable than using observation counts.
    d) I also note that in this https://github.com/hall-lab/svtyper/issues/10#issuecomment-413704489, QUAL scores are mentioned. Should I care about them?

  2. For tumor-normal pairs:
    a) Keep non-reference SVs in the tumor.
    b) Keep SVs which have no alternate depth (AO==0) in the normal.
    c) Sufficient depth? There are two answers referring to depth: (1) https://github.com/arq5x/lumpy-sv/issues/207#issuecomment-344651452 says that potential somatic SVs should have sufficient depth in the normal (RO>~7), but (2) https://github.com/arq5x/lumpy-sv/issues/249#issuecomment-398623306 says to consider only variants that have sufficient depth (DP) in both tumor and normal. Which filter should I use, RO or DP? And what is the threshold for 30x WGS?

Thank you very much for any idea !

ryanlayer commented 6 years ago

On Tue, Sep 4, 2018 at 4:20 AM lss1227 notifications@github.com wrote:

Hi, I have read many posts about filtering Lumpy SV results after genotyping with SVTyper but there are still questions confusing me.

1. For a normal sample, should I filter germline SVs with the following steps? a) I think we should remove SVs whose GT==0/0. b) Make sure that total coverage over the SV is also reasonable (filter on DP? I don't know the threshold for 30x WGS). c) Filter on SU? In Delly, SVs are flagged as "PASS" if >=3 paired-ends support the variant. SU has a strict mapping quality filter (from this https://groups.google.com/d/msg/lumpy-discuss/aKpdTth7vBc/NFBct-p5CgAJ), which I think makes it more reasonable than using observation counts. d) I also note that in this https://github.com/hall-lab/svtyper/issues/10#issuecomment-413704489, QUAL scores are mentioned. Should I care about them?

Filter on the SVTyper fields AO and RO, not on SU. As you point out, SU has strict mapping quality filters, and it does not capture the normal signal in that region. QUAL scores are helpful, but because I focus on AO and RO I do not have a good feel for which QUAL ranges are good.
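As a rough illustration of filtering on AO and RO, here is a minimal Python sketch for a single-sample, SVTyper-genotyped VCF. It assumes the sample is in the first sample column and that GT, AO and RO are present in FORMAT. The function names and the thresholds `min_ao` and `min_depth` are hypothetical placeholders for illustration, not values recommended in this thread.

```python
def parse_sample(format_field, sample_field):
    """Map FORMAT keys (e.g. GT:AO:RO) to one sample's values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def count(value):
    """Turn a count field into an int; '.' (missing) counts as 0."""
    return sum(int(v) for v in value.split(",") if v not in (".", ""))


def keep_germline(vcf_line, min_ao=3, min_depth=8):
    """Return True if a single-sample record passes the AO/RO filters.

    min_ao and min_depth are illustrative assumptions only.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    sample = parse_sample(cols[8], cols[9])  # FORMAT, first sample column
    if sample.get("GT", "./.") in ("0/0", "./."):
        return False  # drop homozygous-reference and ungenotyped calls
    ao = count(sample.get("AO", "."))
    ro = count(sample.get("RO", "."))
    # Require alternate support and enough total observations (AO + RO).
    return ao >= min_ao and (ao + ro) >= min_depth
```

One way to use it is to pass through header lines (those starting with `#`) unchanged and write out only the records for which `keep_germline` returns True.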

2. For tumor-normal pairs: a) keep non-reference SVs in the tumor; b) keep SVs which have no alternate depth in reference (AO==0); c) sufficient depth? There are two answers referring to depth. (1) https://github.com/arq5x/lumpy-sv/issues/207#issuecomment-344651452 says that potential somatic SVs should have sufficient depth in the normal (RO>~7), but (2) https://github.com/arq5x/lumpy-sv/issues/249#issuecomment-398623306 says to consider only variants that have sufficient depth (DP) in both tumor and normal. Which filter should I use, RO or DP?

DP is close to RO + AO, so either way is fine.
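The tumor-normal criteria quoted above can be sketched the same way. This is a minimal Python sketch, assuming a two-sample SVTyper VCF with the normal in the first sample column and the tumor in the second. The column order, the function names and `min_tumor_depth` are assumptions; `min_normal_ro=8` mirrors the "RO>~7" quoted in the question. Because the normal is required to have AO==0, RO + AO reduces to RO there, which is one way to see why filtering the normal on RO or on DP gives much the same result.

```python
def sample_fields(format_field, sample_field):
    """Map FORMAT keys (e.g. GT:AO:RO) to one sample's values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def count(value):
    """Turn a count field into an int; '.' (missing) counts as 0."""
    return sum(int(v) for v in value.split(",") if v not in (".", ""))


def is_candidate_somatic(vcf_line, normal_col=9, tumor_col=10,
                         min_normal_ro=8, min_tumor_depth=8):
    """Return True if a tumor-normal record looks like a somatic candidate.

    Column indices and min_tumor_depth are illustrative assumptions.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    normal = sample_fields(cols[8], cols[normal_col])
    tumor = sample_fields(cols[8], cols[tumor_col])
    if tumor.get("GT", "./.") in ("0/0", "./."):
        return False  # a) tumor must be non-reference
    if count(normal.get("AO", ".")) != 0:
        return False  # b) no alternate observations in the normal
    if count(normal.get("RO", ".")) < min_normal_ro:
        return False  # c) enough reference observations in the normal
    tumor_depth = count(tumor.get("AO", ".")) + count(tumor.get("RO", "."))
    return tumor_depth >= min_tumor_depth  # c) enough depth in the tumor
```

If the samples are in the opposite order in the VCF, swap `normal_col` and `tumor_col` when calling the function.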


lss1227 commented 6 years ago

Thank you very much