stjudecloud / workflows

Bioinformatics workflows developed for and used on the St. Jude Cloud project.
MIT License

feat(QC): adds optional optical dup marking #142

Closed a-frantz closed 5 months ago

a-frantz commented 5 months ago

Nothing too complicated for anyone familiar with WDL code, though I do worry the behavior is a bit confusing for users. There are two new parameters (only one of which is related to the title).

  1. optical_distance, which defaults to zero. If it is > 0 AND mark_duplicates == true, optical dup marking is enabled in Picard. Small-ish problem: the parameter needs appropriate docs in both QC and picard.wdl, so we have to repeat ourselves. If there's a way I haven't thought of to cut down on that repetition, LMK.
  2. store_kraken_sequences is very straightforward. This should have been done a while ago but was overlooked. It's tangentially related to the other changes at best, but when has this repo enforced that our PRs keep a narrow scope lol.
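
Since the interaction between the two settings is the part I expect to confuse people, here is a rough Python model of the gating described in item 1. It is only a sketch, not the WDL in this PR: the function names are made up for illustration, and I'm assuming the Picard task maps a positive optical_distance to MarkDuplicates' OPTICAL_DUPLICATE_PIXEL_DISTANCE and otherwise sets READ_NAME_REGEX to null, which is the documented way to turn optical duplicate detection off.

```python
from typing import List


def optical_marking_enabled(mark_duplicates: bool, optical_distance: int) -> bool:
    """True only when duplicate marking is on AND the distance is positive."""
    return mark_duplicates and optical_distance > 0


def picard_optical_args(mark_duplicates: bool, optical_distance: int) -> List[str]:
    """Hypothetical helper: extra MarkDuplicates arguments for these settings."""
    if not mark_duplicates:
        # MarkDuplicates is not run at all, so optical_distance is ignored.
        return []
    if optical_marking_enabled(mark_duplicates, optical_distance):
        # Assumption: the distance is passed straight through to Picard.
        return ["--OPTICAL_DUPLICATE_PIXEL_DISTANCE", str(optical_distance)]
    # Default (optical_distance == 0): ordinary duplicate marking only.
    # Picard treats READ_NAME_REGEX=null as "disable optical duplicate detection".
    return ["--READ_NAME_REGEX", "null"]


if __name__ == "__main__":
    for mark, dist in [(False, 2500), (True, 0), (True, 2500)]:
        print(mark, dist, picard_optical_args(mark, dist))
```

The case most likely to surprise users is the first one: a nonzero optical_distance has no effect unless mark_duplicates is also true.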

The last change worth pointing out is that I've added the conditions under which optional outputs are created. store_kraken_sequences is related to this, because without that parameter the conditions would have been bizarre.