populationgenomics / production-pipelines

Genomics workflows for CPG using Hail Batch
MIT License

CramQC stage cannot run on exomes #294

Open cassimons opened 1 year ago

cassimons commented 1 year ago

In the recent update to analysis_keys, the CramQC stage now hard-codes picard_wgs_metrics as an expected analysis_key. However, when run on exome data, this stage produces picard_hs_metrics, not picard_wgs_metrics.

In the current state, if the pipeline is run on exome data you get the following error (edited for clarity):

cpg_workflows.workflow.WorkflowError: Cannot create Analysis for stage CramQC: `analysis_keys` "
    ['somalier', 'verify_bamid', ... 'quality_yield_metrics', 'picard_wgs_metrics']" 
is not a subset of the expected_outputs keys dict_keys(
    ['somalier', 'verify_bamid', ... 'quality_yield_metrics', 'picard_hs_metrics'])

I am not sure where this target-dependent behaviour should be handled.
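
One possible direction, as a minimal sketch only: derive the Picard key from the sequencing type instead of hard-coding it, then run the same subset check that raised the error above. Every name here (`COMMON_KEYS`, `PICARD_KEY_BY_SEQ_TYPE`, `analysis_keys_for`, `check_analysis_keys`) and the `'genome'`/`'exome'` labels are hypothetical and are not taken from cpg_workflows. The keys elided with `...` in the error message are left out rather than guessed.

```python
# Hypothetical sketch; none of these names come from the cpg_workflows codebase.

# Keys that appear in the error message for both targets.
# The keys elided with "..." in the error are deliberately omitted here.
COMMON_KEYS = ['somalier', 'verify_bamid', 'quality_yield_metrics']

# Picard metrics key per sequencing type, as described in this issue.
# The 'genome' / 'exome' labels are an assumption made for illustration.
PICARD_KEY_BY_SEQ_TYPE = {
    'genome': 'picard_wgs_metrics',
    'exome': 'picard_hs_metrics',
}


def analysis_keys_for(sequencing_type):
    """Return the analysis keys expected for the given sequencing type."""
    try:
        picard_key = PICARD_KEY_BY_SEQ_TYPE[sequencing_type]
    except KeyError:
        raise ValueError(f'Unsupported sequencing type: {sequencing_type!r}') from None
    return COMMON_KEYS + [picard_key]


def check_analysis_keys(analysis_keys, expected_outputs):
    """Raise if any analysis key is missing from the expected outputs."""
    missing = set(analysis_keys) - set(expected_outputs)
    if missing:
        raise ValueError(
            f'analysis_keys {sorted(missing)} are not in the expected_outputs keys '
            f'{sorted(expected_outputs)}'
        )


# Exome outputs paired with exome-aware keys pass the check.
exome_outputs = {key: None for key in COMMON_KEYS + ['picard_hs_metrics']}
check_analysis_keys(analysis_keys_for('exome'), exome_outputs)
```

With a lookup like this, the hard-coded list would only fail when the sequencing type is unknown, rather than on every exome run. Whether the choice belongs in the stage itself or in the workflow configuration is still the open question.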

vivbak commented 1 year ago

This is annoying, sorry @cassimons! Is it blocking you? If so, I'll remove these keys quickly now while I think about the best way to handle this.

cassimons commented 1 year ago

Thanks @vivbak, that fix has got the pipeline running again for the moment.