DataBiosphere / toil

A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
http://toil.ucsc-cgl.org/.
Apache License 2.0

CWL: config file for always-needed toil-cwl-runner options #4142

Open mr-c opened 2 years ago

mr-c commented 2 years ago

Many sites that run toil-cwl-runner, especially HPC systems, need quite a lot of options defined on every invocation. It would be nice if there were a config file that supplied the baseline values for the command-line options.

See https://github.com/EBI-Metagenomics/emg-viral-pipeline/blob/f367002f0e1e375840e5696323bde65f7accb31f/cwl/virify.sh#L300 for an advanced script from a toil-cwl-runner user

Where? In toil-cwl. Format: INI, TOML, or something similar, stored under ${XDG_CONFIG_HOME} (which defaults to ${HOME}/.config).

Additional command line options should have priority. The location of the config file should also be overridable with an environment variable.
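As a rough sketch of the lookup and precedence rules described above: the environment variable name `TOIL_CWL_CONFIG` and the file name `toil-cwl-runner/config.ini` are hypothetical placeholders, not existing Toil settings. The sketch only assumes that an explicit override wins over `${XDG_CONFIG_HOME}`, which in turn falls back to `${HOME}/.config`, and that command-line values beat file values.

```python
import os
from pathlib import Path

# Hypothetical names, chosen only for illustration.
ENV_OVERRIDE = "TOIL_CWL_CONFIG"
RELATIVE_PATH = Path("toil-cwl-runner") / "config.ini"


def config_path(environ=None):
    """Return the config file location for the given environment mapping."""
    env = os.environ if environ is None else environ
    override = env.get(ENV_OVERRIDE)
    if override:
        # An explicit override replaces the whole path.
        return Path(override)
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:
        base = Path(xdg)
    else:
        # XDG default: ${HOME}/.config
        home = env.get("HOME") or os.path.expanduser("~")
        base = Path(home) / ".config"
    return base / RELATIVE_PATH


def merge_options(file_options, cli_options):
    """Combine both sources; command-line values that were set take priority."""
    merged = dict(file_options)
    merged.update({k: v for k, v in cli_options.items() if v is not None})
    return merged
```

Here `None` stands for "not given on the command line", so an option left unset there does not mask the value from the file.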

┆Issue is synchronized with this Jira Story ┆friendlyId: TOIL-1186

adamnovak commented 2 years ago

We have wanted a config file for a while; we should be able to hook into some of the code we already have for pulling option values from either the command line or environment variables.

It might also be nice if it could integrate well with user Toil workflows, so e.g. Cactus or toil-vg could ride the same system.

Lon had the idea of a config generator to show you all the defaults.
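One way such a generator could work, sketched with the standard library only: parse an empty command line to collect every default, then write them out as an INI section. The parser and its option names (`--workers`, `--scratch-dir`) are hypothetical stand-ins, not Toil's real options, and this is not how Toil implements it.

```python
import argparse
import configparser
import io


def build_parser():
    # Hypothetical stand-in parser; the options are illustrative only.
    parser = argparse.ArgumentParser(prog="example-runner")
    parser.add_argument("--workers", type=int, default=1)
    parser.add_argument("--scratch-dir", default="/tmp")
    parser.add_argument("--log-file", default=None)
    return parser


def dump_defaults(parser, section="options"):
    """Render the parser's defaults as INI text."""
    # Parsing an empty argument list yields a namespace holding only defaults.
    # This fails if the parser has required arguments.
    defaults = vars(parser.parse_args([]))
    # interpolation=None keeps '%' in values from being treated specially.
    config = configparser.ConfigParser(interpolation=None)
    config[section] = {
        name: str(value) for name, value in defaults.items() if value is not None
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(dump_defaults(build_parser()))
```

Options whose default is `None` are skipped here; a fuller generator might emit them as commented-out lines so users can still discover them.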

@mr-c Do you want to implement this system?

adamnovak commented 9 months ago

@stxue1 Did your new config file stuff in #4569 address this?

stxue1 commented 9 months ago

It looks like the config file does not work for toil-cwl-runner. I think configargparse has an issue with nargs=argparse.REMAINDER: if another nargs/append action is in the mix, the REMAINDER argparse option seems to swallow all of the config file options.
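One possible way to sidestep this kind of interaction, sketched with plain `argparse` and `configparser` rather than configargparse: read the file first and install its values as parser defaults, so the REMAINDER positional only ever sees the real command line. This is an untested idea, not a confirmed fix, and the option names and the `[options]` section are hypothetical.

```python
import argparse
import configparser


def parse_with_file_defaults(argv, config_text):
    """Parse argv, using values from INI text as defaults."""
    # Hypothetical stand-in parser with a REMAINDER positional.
    parser = argparse.ArgumentParser(prog="example-runner")
    parser.add_argument("--workers", type=int, default=1)
    parser.add_argument("workflow")
    parser.add_argument("job_args", nargs=argparse.REMAINDER)

    config = configparser.ConfigParser(interpolation=None)
    config.read_string(config_text)
    if config.has_option("options", "workers"):
        # File values become defaults; they are never placed in argv.
        parser.set_defaults(workers=config.getint("options", "workers"))

    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_with_file_defaults(
        ["wf.cwl", "--input", "x"], "[options]\nworkers = 4\n"
    )
    print(args.workers, args.workflow, args.job_args)
```

Because the file never contributes to the argument list, anything given explicitly on the command line still overrides it in the usual way.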