changes made during test run-through using singularity containers and sliced fastq files (200 records).
updated variable names for consistency during test run-through
added params.ID
added missing indexing step after samtools sort on marked duplicate file
notes:
I did not split and run chroms in parallel with mutect2, but these test files have several chromosomes including UNK, so they could be a good test for that part of the pipeline as well.
there are separate dockerfiles and images for mutect and gatk; I ran this without touching the mutect image and only using gatk. If we update to remove the mutect image and have just one gatk image, should we also update the directory structure so mutect is nested within gatk with the rest of the gatk tools?
snpEff annotated all variants regardless of FILTER status (PASS or not); is this expected?
these are probably ok, but I didn't QC the file output from LearnReadOrientations because I don't know what to look for, and neither CalculateContamination nor GetPileupSummaries generated results (too few records?).
params variables are currently set up to point to the same reference genome twice (idx and mutect_idx), or at least I think it's the same genome. I did not edit this, but I am wondering whether these should be a single variable.
the file names are getting really long; it might be worth going through and replacing "_" with "." so we can use simpleName to keep them short
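To illustrate the simpleName suggestion above, here is a minimal sketch in Python (not pipeline code) that mimics how Nextflow's simpleName keeps only the part of a file name before the first dot. The helper `simple_name` and the sample file names are hypothetical and are not taken from the pipeline.

```python
def simple_name(filename: str) -> str:
    """Mimic Nextflow's simpleName: the file name up to the first dot."""
    base = filename.rsplit("/", 1)[-1]  # drop any leading directory path
    return base.split(".", 1)[0]        # keep everything before the first "."


# Hypothetical file names, for illustration only
underscored = "sampleA_sorted_markdup_recal.bam"
dotted = "sampleA.sorted.markdup.recal.bam"

print(simple_name(underscored))  # the whole underscore-joined stem survives
print(simple_name(dotted))       # only the sample ID survives
```

With "." as the separator, each process could rebuild a short output name from the sample ID alone instead of carrying every upstream suffix forward.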