Closed samuelklee closed 4 years ago
Just noting: after today's discussion, we can probably move forward by relaxing downstream validation stringency, at least for now. (The invalid reads may actually cause some tools to fail, in which case we will need to go back and reexamine demultiplexing/alignment.) I will continue investigating the issue with the unmapped/split reads to make sure it is indeed a separate one. So the next step will be to run M2 out of the box.
Thanks @fleharty! @JonKeatley112 any comments?
@samuelklee When you run this, do you use the methods Cromwell server?
I set up a Terra workspace (broad-firecloud-dsde/malariagen-dev) just to double check that it ran on the cloud successfully. It only takes a few minutes to run either there or locally.
@samuelklee Could you provide a link to the Terra workspace? (Oh, never mind, you did...)
Is there a reason this isn't merged yet?
Was waiting to see if @JonKeatley112 wanted to comment, but I’ll go ahead and merge.
@samuelklee Could you share the Terra workspace broad-firecloud-dsde/malariagen-dev with me? I can't seem to access it.
Very basic port of Stage 1, Step 3 of the amplicon SNP calling parasite pipeline. Adds a Dockerfile (for a conda environment that contains all necessary tools), a single-task WDL, and a test JSON.
Note that the test JSON points to a CRAM and a VCF (containing targets for collecting pileups) which have been "lifted over." That is, the original pipeline aligned to a reference containing separate contigs for each panel target, whereas these files are with respect to the full reference.
In putting this together, I discovered other issues in upstream steps that may need to be corrected (in addition to aligning against the full reference) before we can run GATK/Picard-based comparisons against this baseline. For example, Step 2 does not produce a valid CRAM file according to Picard ValidateSamFile. There are also some issues with unmapped/split reads that I'd like to understand.
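For anyone who wants to reproduce the validation check, here is a minimal sketch that assembles a Picard ValidateSamFile command line. It is not part of this PR; the file names `sample.cram` and `reference.fasta`, the `picard.jar` path, and the helper name `validate_sam_command` are all placeholders assumed for illustration.

```python
import shlex


def validate_sam_command(cram, reference, mode="SUMMARY", picard_jar="picard.jar"):
    """Build a Picard ValidateSamFile command line as a list of arguments.

    All paths are hypothetical placeholders; substitute the real Step 2 output
    and the full reference it was lifted over to.
    """
    return [
        "java", "-jar", picard_jar, "ValidateSamFile",
        f"I={cram}",
        f"R={reference}",  # a reference is needed to decode CRAM input
        f"MODE={mode}",    # SUMMARY reports counts per error type; VERBOSE lists each record
    ]


cmd = validate_sam_command("sample.cram", "reference.fasta")
print(shlex.join(cmd))
```

Running the printed command with `MODE=SUMMARY` first gives a quick tally of error types, which should help decide whether relaxing validation stringency downstream is enough or whether the upstream steps need to be revisited.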
We can fix up and/or reorganize things as we move along; this is just to get us started.
Closes #37.