Large PR that creates a tumor-only Mutect2 workflow (the production workflow can actually run both somatic and tumor-only). I'll just go through the commits and break down what I did:
c2a5168: Added a small test script designed to be run pre-commit
2536e55: Ported the Mutect2 production workflow, subworkflows, and tools from the somatic repo
f4d82bf: Added a SplitIntervals tool. Wanted to see what level of similarity it has with our IntervalListTool. Turned out to give different intervals so I decided to not replace our existing tool.
7f34b94: Added GetSampleName tool. This tool is part of the Broad pipeline, whereas our pipeline relies on the user providing those values. I also cleaned up the existing pipeline since it was throwing cwltool errors.
c63016a: Updated the Mutect2 tool. Basically modernized the tool and opened up more of its options to users. Also made a lot of styling changes so that the tool no longer throws errors with sbpack.
0f33aec: I pulled the learn orientation bias tool out of the filter support subworkflow. The reason for this was that the tool was stuck waiting for Mutect2 to finish in the filter support subworkflow. Pulling it out of the subworkflow allows it to run concurrently with Mutect2. In the end it turned out to be a bit of an optimization for optimization's sake: no runtime or cost savings were observed. Also got rid of the hard-coded scatter count input.
a7ebafc: Added steps and flags that allow the user to create a BAM output from Mutect2. Added as a conditional step. Only really added this because it is a part of the GATK best practices workflow. We won't be using it ourselves but users can use it if they like.
e2fa90b: Added two tools: bwaindeximagecreator and filteralignmentartifacts. Also wrapped the latter tool in the workflow as a conditional step. Again, this functionality was added because it is part of the GATK best practices workflow, but as far as I can tell we will not be using it ourselves. The tool is still very buggy and not recommended for production runs.
0a1e57b: Major change here is to make all extra Mutect2 stuff optional with flags. Gave FilterMutectCalls and CalculateContamination a modernization. Also quite importantly updated FilterAlignmentArtifacts docker to GATK 4.2; the GATK tool was complete garbage on 4.1 and received a complete rewrite for 4.2.
e421262: After testing the previous commit I moved forward with bringing everything in line with the original somatic workflow. I fleshed out all of the documentation and defaults for the production workflow so that it's more straightforward. With respect to the defaults, if a user provides nothing but the basics to the workflow, it will run the same steps as our somatic workflow. Also some minor formatting cleanup for readability.
8b57515: Big change of adding the Panel of Normals workflow and tools. Pretty straightforward workflow that scatters over the normal crams running a minimal Mutect2 subworkflow then collecting those outputs and creating the panel. Couple of quick bug fixes here: first, my extra_args inputs needed shellQuote set to false to avoid CWL adding quotes to my stuff and breaking commands; second, pickValue at a subworkflow output was causing issues in Cavatica where scattered jobs were not returning file arrays but rather single file objects (offloading this functionality to an expressiontool step was the workaround I chose).
de63e79 + 8741dc1: Added new docs for the somatic panel workflow and updated the two workflows to public app formatting.
3bc6edb: Decided I didn't like the disharmony in our GATK tools so I went through and did some basic cleanup so they all look mostly the same.
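For reviewers unfamiliar with the shellQuote fix mentioned under 8b57515, here is a minimal sketch of the pattern. It is not the actual tool from this PR: the tool id and the use of `echo` as the base command are stand-ins, and only the `extra_args` input name comes from the description above. The key points are that `shellQuote: false` only takes effect when `ShellCommandRequirement` is declared, and that the unquoted string is then split into separate arguments by the shell.

```yaml
cwlVersion: v1.2
class: CommandLineTool
# Hypothetical id; not a tool from this PR.
id: extra_args_sketch
requirements:
  # Required for shellQuote: false to be honored.
  - class: ShellCommandRequirement
# Stand-in command so the sketch is safe to run; the real tools call GATK.
baseCommand: [echo]
inputs:
  extra_args:
    type: string?
    doc: "Free-form additional arguments passed through to the command as-is."
    inputBinding:
      position: 1
      # Without this, CWL wraps the whole string in quotes and the
      # command receives it as one single argument.
      shellQuote: false
outputs: []
```

Because the value is handed to the shell unquoted, anything a user puts in `extra_args` is interpreted by the shell, so this is best reserved for inputs that are meant to carry raw command-line flags.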
Description
Part of https://github.com/d3b-center/bixu-tracker/issues/102
Type of change
How Has This Been Tested?
Test Configuration:
Checklist: