bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Clarify docs on germline-somatic calling for tumor samples #1866

Closed: lbeltrame closed 7 years ago

lbeltrame commented 7 years ago

The example configuration shows that you can use germline calling even with a tumor sample, but, at least when re-running an already completed analysis, this does not seem to happen.

Is anything else needed, or does anything need to be deleted, for it to run properly? Or is it unsupported?

For some samples it may be necessary to check for the presence of germline variants in both phenotypes.

ohofmann commented 7 years ago

I can't comment on existing projects, but let me know if you get a chance to compare results. I'm currently exploring different callers, looking for cases where variants get called in both phenotypes, and exploring ways to filter these.

chapmanb commented 7 years ago

Luca, apologies about the confusion. I've amended the documentation to be clearer on this. bcbio does the germline calling for tumor/normal pairs only, and uses the normal sample for the germline calls. The thinking is that the majority of callers, like VarDict, FreeBayes and VarScan, are already essentially calling germline variants in the tumor and then subtracting them out based on the normal. So the values are there for any downstream comparison work, but having a single set of somatic and germline calls for the pair makes most standard interpretation more straightforward. Hope this helps.
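
For readers landing here later, a rough sketch of what such a tumor/normal pair might look like in the sample YAML. This is only a fragment (a real file also needs inputs, genome build and so on). The sample names, the batch name and the particular caller choices are made up for illustration; the `somatic`/`germline` keys under `variantcaller` follow the form used in the bcbio documentation on calling somatic with germline variants.

```yaml
# Fragment only. Sample names, batch name and caller choices are hypothetical.
details:
  - description: patient1-normal
    metadata:
      batch: patient1        # a shared batch is what pairs the two samples
      phenotype: normal      # germline calls are made on this sample
    algorithm:
      variantcaller:
        somatic: vardict
        germline: freebayes
  - description: patient1-tumor
    metadata:
      batch: patient1
      phenotype: tumor       # somatic calls are tumor relative to the normal
    algorithm:
      variantcaller:
        somatic: vardict
        germline: freebayes
```

With a setup along these lines, the expectation from the explanation above is one somatic call set for the pair and one germline call set from the normal, not a separate germline call set for the tumor.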

lbeltrame commented 7 years ago

FTR, the ambiguity I had comes from here: https://github.com/chapmanb/bcbio-nextgen/blob/879431f45433a3089943890c078a8e47f68b1ee8/tests/data/automated/run_info-cancer.yaml#L31

I saw that when looking for examples of germline calling in tumor paired samples.