bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Re-Deriving Ensemble Mutations w/ different numpass parameter #1977

Closed (ghost closed 7 years ago)

ghost commented 7 years ago

Hey,

I've used ensemble mutation calling to call mutations which appear in 2/5 individual callers. I'd now like to go back and re-derive the sets which appear in 1/5, 3/5, 4/5, and 5/5. Is there a way of doing this without repeating the whole mutation-calling step, using the individually called VCF files as input? What would the YAML for this look like? I've attached the basic format of the one I'm using.

details:

Best wishes,

Nick

chapmanb commented 7 years ago

Nick, thanks for the question. We don't have a clean way to re-run the ensemble calling from within bcbio unless you've saved the work directory. If you have the work directory, you can manually remove the ensemble directory, adjust the YAML settings, and re-run to produce a new ensemble callset.

Alternatively, you could run with the minimum numpass you want (1 in this case) and then do custom post-filtering for the higher thresholds using the CALLERS attribute in the INFO field. That's probably the easiest way to avoid a lot of manual re-running and editing.
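As a rough illustration of that post-filtering, here is a minimal sketch in plain Python. It assumes that CALLERS is a comma-separated list of caller names in the INFO column (for example `CALLERS=vardict,mutect2`); check this against your own ensemble VCF before relying on it. The function names, the caller names, and the example records are made up for illustration and are not part of bcbio.

```python
def count_callers(info, key="CALLERS"):
    """Count the callers listed under `key` in a VCF INFO string.

    Assumption: the value is a comma-separated list of caller names.
    Returns 0 when the key is absent or has no value.
    """
    for field in info.split(";"):
        name, sep, value = field.partition("=")
        if name == key and sep:
            return len([caller for caller in value.split(",") if caller])
    return 0


def filter_by_numpass(lines, numpass):
    """Yield header lines unchanged, plus records seen by >= numpass callers."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        # INFO is the 8th tab-separated column of a VCF record.
        if len(cols) >= 8 and count_callers(cols[7]) >= numpass:
            yield line


# Tiny made-up example; caller names are placeholders.
example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t100\t.\tA\tT\t.\tPASS\tCALLERS=mutect2,vardict,strelka2\n",
    "1\t200\t.\tG\tC\t.\tPASS\tDP=30;CALLERS=vardict\n",
]

at_least_three = list(filter_by_numpass(example_vcf, numpass=3))
```

With a single numpass 1 callset on disk, the same function can be looped over thresholds 2 through 5 to write one filtered file per threshold. For compressed output, opening the file with `gzip.open(path, "rt")` in place of `open` should work for reading.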

Hope this helps.

ghost commented 7 years ago

Hey,

Thanks a lot, I'll give it a go.