nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

feature request: Make "/bin/bash" configurable #5420

Open stianlagstad opened 1 week ago

stianlagstad commented 1 week ago

I would like to make /bin/bash configurable, either through setting an environment variable or through setting a configuration value. Would a PR for this be welcome? Is there anyone I should discuss this with first? Or are there things I should know before I start?

One complicating factor I'm aware of is that one shell may be needed on the host system (or the external worker, such as a GCP Batch instance) and a different shell inside the Docker/Singularity container. In that case two environment variables might be needed.

Prior work and comments on this:

My main motivations for this idea are:

  1. To be able to use nextflow on systems where /bin/bash is not available, and where users can't easily set up a symlink or similar.
  2. To get more reproducible workflows, since we could control which version of bash is used.
  3. To avoid the many duplicate hardcoded strings "/bin/bash" present in the nextflow repository. Running the command find . -name "*.groovy" -type f -exec grep -o "/bin/bash" {} + | wc -l on the current master branch gives 106 results. If there were (many) fewer of these, then other packaging tools would be able to package nextflow and write a small patch which changes the occurrences of these strings.

Any input is appreciated. Thank you!

bentsherman commented 1 week ago

Excluding tests, there are about 15 hardcoded references to /bin/bash, most of them related to launching the .command.run script on a particular executor.
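For anyone who wants to reproduce a count that leaves out tests, here is a rough sketch based on the find command from the issue. It assumes test sources live under a src/test/ directory, and the function name count_bash_refs is made up for illustration. The number it prints depends on the checkout and may not match the estimate above.

```shell
# Sketch: count occurrences of "/bin/bash" in Groovy sources, skipping tests.
# Assumption: test code lives under a path containing /src/test/.
count_bash_refs() {
  root="${1:-.}"   # directory to scan, defaults to the current directory
  # grep -o prints one line per match, so wc -l counts occurrences, not files.
  find "$root" -name '*.groovy' -type f ! -path '*/src/test/*' \
    -exec grep -o '/bin/bash' {} + | wc -l | tr -d ' '
}
```

Running count_bash_refs . from the repository root prints a single number.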

I believe the launcher script has certain assumptions baked in around bash, so it would make sense to point to a different version of bash, but not necessarily to a different shell.

It could make sense as an environment variable or a config option such as executor.bash.
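To make the lookup order concrete, here is a minimal sketch of how such a setting might be resolved. Nothing here exists in Nextflow today: NXF_BASH is a hypothetical variable name, the optional first argument stands in for a config value such as the executor.bash suggested above, and resolve_bash is an illustrative helper. The assumed precedence is config value, then environment variable, then the current /bin/bash default.

```shell
# Sketch only: NXF_BASH and resolve_bash are hypothetical names,
# not existing Nextflow settings or functions.
resolve_bash() {
  # $1 (optional): value of a hypothetical config option such as executor.bash.
  # Falls back to the NXF_BASH environment variable, then to /bin/bash.
  candidate="${1:-${NXF_BASH:-/bin/bash}}"
  if [ -f "$candidate" ] && [ -x "$candidate" ]; then
    printf '%s\n' "$candidate"
  else
    # Fail loudly rather than silently falling back to another shell.
    printf 'bash not found or not executable: %s\n' "$candidate" >&2
    return 1
  fi
}
```

A launcher could then run something like "$(resolve_bash)" .command.run instead of hardcoding the path. The case where the host and the container need different paths would need a second, separately resolved setting.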