nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0
2.76k stars 629 forks

Specify shell for batch jobs #5254

Closed caldodge closed 2 months ago

caldodge commented 2 months ago

New feature

Nextflow scripts appear to use bash features (like array variables), but some schedulers (UGE, in this case) default to /bin/sh, causing array variable code to fail. Can a parameter be added, so that in a config file users can specify /bin/bash as the shell to the scheduler?

Usage scenario

The use case is running Nextflow scripts that rely on bash-specific features as batch jobs.

Suggest implementation

Sorry, I'm not a nextflow expert. Since there are already parameters which are translated into scheduler directives, it would be a simple matter of adding another parameter ("shell", perhaps), and adding code to translate that to a scheduler directive ("#$ -S /bin/bash"). If someone could identify the code which does this translation, I would try tackling it myself.

bentsherman commented 2 months ago

If it's a directive in the job script, you should be able to add it via clusterOptions:

// nextflow.config
process.clusterOptions = '-S /bin/bash'

We could potentially add it to Nextflow here: https://github.com/nextflow-io/nextflow/blob/5a37e6177f7a0e02b2af922768a0df5984b07b7b/modules/nextflow/src/main/groovy/nextflow/executor/SgeExecutor.groovy#L39-L88

The UGE and SGE executors are identical under the hood. As long as this -S option is the same across all Grid Engine variants it should be fine, but I would try the clusterOptions approach first and see if it works for you.
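
As a minimal sketch of the clusterOptions approach in a slightly fuller nextflow.config: this assumes the pipeline runs on the SGE/UGE executor and that the site's qsub accepts -S. The process name ALIGN_READS is hypothetical and only shows how the option could be scoped to one process. Whether -S behaves the same on every Grid Engine variant is not confirmed here.

```groovy
// nextflow.config
// Sketch only: assumes a Grid Engine (SGE/UGE) cluster where qsub accepts -S.
process {
    executor = 'sge'

    // Ask the scheduler to run every job script with bash instead of /bin/sh.
    clusterOptions = '-S /bin/bash'

    // Alternative: scope the option to a single process.
    // 'ALIGN_READS' is a hypothetical process name used for illustration.
    // withName: 'ALIGN_READS' {
    //     clusterOptions = '-S /bin/bash'
    // }
}
```

One way to check the result is to open the .command.run file in a task's work directory and look for the -S /bin/bash directive among the scheduler header lines.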

caldodge commented 2 months ago

Thanks, Ben. I'll give that a try.

Calvin Dodge


caldodge commented 2 months ago

Ben,

Thanks again. This is providing the desired result.

Sincerely,

Calvin Dodge
