theiagen / public_health_bacterial_genomics

GNU Affero General Public License v3.0
26 stars 14 forks source link

add optional input to merlin_magic wf for skipping serotypefinder and specify docker image #183

Closed kapsakcj closed 1 year ago

kapsakcj commented 1 year ago

This PR adds an optional boolean input skip_serotypefinder to the merlin magic workflow to allow the user to skip the usage of serotypefinder via the Esherichia sub-workflow. It is by default set to false but the user can set it to true

This PR also adds an optional input String? serotypefinder_docker_image to allow the user to specify the docker image used (in case there are new versions & docker images in the future)

The original rationale for adding this is to avoid cryptic & hard-to-troubleshoot bugs with serotypefinder. I've very rarely encountered errors with SerotypeFinder and want TheiaProk workflow to run successfully despite the rare serotypefinder error.

More details on the SerotypeFinder bug here, its related to blastn output and XML parsing: https://bitbucket.org/genomicepidemiology/serotypefinder/issues/3/xmlparsersexpatexpaterror-mismatched-tag

This error has since been resolved ^ and the docker image staphb/serotypefinder:2.0.1 will soon be updated to prevent the error docker image has been fixed, but still would be good to have the functionality in the workflow

Tested sucessfully w/ miniwdl and in Terra.

Successful Terra test (n=1): https://app.terra.bio/#workspaces/cdph-terrabio-taborda-manual/curtis_sandbox/job_history/b5542011-622e-4c1b-a255-cdc016bf581f

kevinlibuit commented 1 year ago

@kapsakcj did the updated staphb docker resolve this issue for this particular serotpye? Not sure if we want to open the option of making specifically serotypefinder optional unless we want to invite that functionality for all other merlin_magic tasks.

kapsakcj commented 1 year ago

Yeah, the updated docker image resolves the issue.

I can revert the changes for skip_serotypefinder, but do we want to keep the change I made in regards to serotypefinder_docker? So that users can change the docker image should new versions be released?

Not critical to keep since serotypefinder rarely has updates.

We could just close this PR and kill the branch 🤷

kapsakcj commented 1 year ago

Closing since we do not want the ability to skip certain modules (including serotypefinder).

Will eventually open a separate PR which allows the user to specify the docker image used for each task in merlin_magic