d-callan opened 3 months ago
Possibly a crazy question, but is there a way I can work around this in the meantime, before a fix? I'm kind of stuck as things are.

As I investigate more, it seems like this is due to some odd configuration on my cluster. I can't run Nextflow directly on the head node, where the correct lsf.conf exists, and for whatever reason the lsf.conf file on the worker nodes is not consistent with the head node. I've tried to ask the admins about it, and they are... something less than helpful. I think I'd like to amend this ticket to a feature request:

to be able to explicitly override this unit
This LSF config setting is read here: https://github.com/nextflow-io/nextflow/blob/2fb5bc07f2ad1309c9743b8675bb8003892e3eb7/modules/nextflow/src/main/groovy/nextflow/executor/LsfExecutor.groovy#L315-L320
And the memory options are defined here: https://github.com/nextflow-io/nextflow/blob/2fb5bc07f2ad1309c9743b8675bb8003892e3eb7/modules/nextflow/src/main/groovy/nextflow/executor/LsfExecutor.groovy#L92-L103
So you can see how the various config options affect the final submit options. Maybe you can use the `executor.perJobMemLimit` or `executor.perTaskReserve` options to get what you need.
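For reference, a minimal `nextflow.config` sketch of how those options could be set (values here are illustrative only, adjust for your cluster):

```groovy
// nextflow.config -- illustrative sketch only; adjust for your cluster
process.executor = 'lsf'

executor {
    perJobMemLimit = true    // interpret the memory limit as a per-job value
    perTaskReserve = false   // LSF per-task memory reserve mode (default: false)
}
```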
Thanks @bentsherman for the info. I had another thought recently: what do you think of explicitly adding units to the submission string, so that Nextflow produces something like `bsub -M 50000KB` rather than `bsub -M 50000`? If doable, that seems like it would make this more robust, make my problem go away, and add clarity without changing existing behavior/features.
I didn't realize that was an option. It would make things much simpler. Can a unit be specified for all of those memory settings?
Hmm, good question. I've just now tried to ask for an interactive node on my cluster with `bsub -M 4GB -R "select[mem>=8GB] rusage[mem=8GB]" -Is bash`, and nothing screamed at me or caught fire, so that seems promising.
Okay I see it is documented here: https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=requirements-resource-requirement-strings#vnmbvn__title__3
Assuming this syntax has been supported for a while, it should be fine for Nextflow to use it. I will draft a PR.
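To make the idea concrete, here is a rough, hypothetical sketch of building the memory submit options with an explicit unit suffix (not the actual PR; the helper name is made up):

```groovy
// Hypothetical sketch, not the real LsfExecutor change: build the LSF memory
// options with an explicit unit so bsub does not depend on the submitting
// node's LSF_UNIT_FOR_LIMITS setting.
List<String> memoryDirectives(long valueMB) {
    def mem = "${valueMB}MB"
    def result = []
    result << '-M' << mem.toString()
    result << '-R' << "select[mem>=${mem}] rusage[mem=${mem}]".toString()
    return result
}

assert memoryDirectives(80) == ['-M', '80MB', '-R', 'select[mem>=80MB] rusage[mem=80MB]']
```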
Bug report
Expected behavior and actual behavior
Jobs submitted on an LSF cluster should respect the value of `LSF_UNIT_FOR_LIMITS` in `lsf.conf`, per #1124. However, running on a cluster where this unit is set to MB, a task asking for 80 MB sees a header in `.command.run` files like the following:

Steps to reproduce the problem
On an LSF cluster with a non-default setting for `LSF_UNIT_FOR_LIMITS`, I attempted to run an nf-core pipeline.
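A minimal pipeline along these lines should be enough to trigger it (hypothetical sketch; the process name and script are made up, and it assumes an LSF cluster whose lsf.conf sets `LSF_UNIT_FOR_LIMITS` to a non-default unit such as MB):

```groovy
// main.nf -- minimal hypothetical reproduction, not the nf-core pipeline itself
process smallMemTask {
    executor 'lsf'
    memory '80 MB'

    output:
    path 'done.txt'

    script:
    """
    touch done.txt
    """
}

workflow {
    smallMemTask()
}
```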
Program output

The cluster fails to start jobs, saying I've requested more resources than the queue allows.
Environment