When cwltool runs with the --cachedir option, its command_line_tool module builds a cache key from the tool's input files, command line, and requirements (e.g. Docker image). Through experimentation, I found that the command lines for fastqc and trim_galore changed on every run of the workflow.
These tools used $(runtime.outdir) to build their command lines. The runtime outdir is a randomly named directory that changes on every run, so these steps could never be cached. Since all of the downstream processing depends on the trimmed reads, the rest of the workflow could never be cached either.
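To illustrate the effect, here is a minimal Python sketch. It is not cwltool's actual implementation: `cache_key` and `run_key` are hypothetical helpers, the checksum and Docker image name are made up, and the hashing scheme is an assumption chosen only to show that any random path embedded in the command line changes the key.

```python
import hashlib
import json
import os
import tempfile
import uuid


def cache_key(command_line, input_checksums, docker_image):
    """Hypothetical cache key: hash the command line, inputs, and image together."""
    payload = json.dumps(
        {"cmdline": command_line, "inputs": input_checksums, "docker": docker_image},
        sort_keys=True,
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def run_key(use_outdir):
    """Simulate one workflow run and return the key for a fastqc-like step."""
    # A fresh random path per run stands in for $(runtime.outdir).
    outdir = os.path.join(tempfile.gettempdir(), uuid.uuid4().hex)
    # Assumption: "." stands in for any command line that avoids the absolute outdir.
    target = outdir if use_outdir else "."
    command_line = ["fastqc", "--outdir", target, "reads.fastq.gz"]
    inputs = {"reads.fastq.gz": "sha1$abc123"}  # made-up checksum
    return cache_key(command_line, inputs, "example/fastqc:latest")  # made-up image


outdir_keys = (run_key(True), run_key(True))
relative_keys = (run_key(False), run_key(False))

print(outdir_keys[0] == outdir_keys[1])      # False: the key differs on every run
print(relative_keys[0] == relative_keys[1])  # True: the key is stable across runs
```

With the absolute outdir in the command line, two otherwise identical runs never share a key, so a cache lookup always misses.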
This PR changes those definitions to use CWL features (InitialWorkDirRequirement for fastqc) or lean on tool default behavior (for trim_galore), allowing every step of the preprocessing workflow to be cached.
See https://github.com/common-workflow-language/cwltool/blob/4a31f2a1c1163492ae37bbc748a299e8318c462c/cwltool/command_line_tool.py#L328-L355 for where cwltool builds the cache key.