theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[TheiaProk] tbprofiler optimizations #427

Open kapsakcj opened 2 months ago

kapsakcj commented 2 months ago


:pushpin: Explain the Request

tb-profiler runs single-threaded even though the WDL task requests 8 CPUs, because the CPU count is never passed to the tb-profiler command:

https://github.com/theiagen/public_health_bioinformatics/blob/2f1b35163126571e3f349707fd64efeb1aee348f/tasks/species_typing/mycobacterium/task_tbprofiler.wdl#L18

We need to add --threads ~{cpus} to the tb-profiler command so it uses the CPUs the task requests. We could also consider adding --ram ~{memory} to cap its maximum RAM usage.
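As a rough illustration of the proposed change, here is a minimal sketch of a task that forwards its resource requests to tb-profiler. This is not the actual task_tbprofiler.wdl: the task name, the input names (cpus and memory follow the placeholders used above), and the docker input are assumptions made for illustration, and it assumes the installed tb-profiler version accepts --ram with a value in GB.

```wdl
version 1.0

# Hypothetical, simplified task; the real task has more inputs and outputs.
task tbprofiler_sketch {
  input {
    File read1
    File? read2
    String samplename
    String docker      # assumption: container image supplied by the caller
    Int cpus = 8       # forwarded to --threads and to the runtime block
    Int memory = 16    # assumed to be GB; forwarded to --ram and to the runtime block
  }
  command <<<
    tb-profiler profile \
      --read1 ~{read1} \
      ~{"--read2 " + read2} \
      --prefix ~{samplename} \
      --threads ~{cpus} \
      --ram ~{memory}
  >>>
  runtime {
    docker: docker
    cpu: cpus
    memory: "~{memory} GB"
  }
}
```

Driving both the command-line flags and the runtime block from the same inputs keeps the resources tb-profiler tries to use in step with what the backend actually allocates.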

We should also consider upgrading to the latest version of tb-profiler. The latest release as of today is v6.2.0, whereas the default is v4.4.2.

:books: Context

:chart_with_upwards_trend: Desired Behavior

:information_source: Additional Information