This is the affected line:
To me the best solution is to use a Groovy template; otherwise we can use a bash script to create the template (example: https://github.com/ssadedin/bpipe/blob/master/bin/bpipe-torque.sh#L170-L171).
Please try this version of the SGE command executor:
https://github.com/tucano/bpipe
(master branch)
Example of cmd.sh:
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -N "hello"
#$ -terse
#$ -V
#$ -o .bpipe/commandtmp/1/cmd.out
#$ -e .bpipe/commandtmp/1/cmd.err
#$ -l slots=1
echo "Hello World";
Please note that in this driver you can specify additional options using bpipe.config:
See:
To add options to the qsub command "inline" (not in cmd.sh), use the bpipe.config key sge_request_options
Example:
executor="sge"
sge_request_options="-q queue_name"
This appends the queue option to the qsub request:
qsub -q queue_name
see also: https://github.com/tucano/bpipe/blob/master/src/main/groovy/bpipe/executor/SgeCommandExecutor.groovy#L109-L111
Cheers
I think the fundamental problem is that there is no separation of concerns if you simply append to an SGE script. Bpipe has no way of understanding that an actual script might exit early (for good reasons); whether that exit is successful or not, Bpipe just won't know about it.
Please don't force #$ -V on users. It will break things in certain environments...
-V is essential for job scripts to inherit a functional environment, particularly when you use something like the "environment-modules" system on most clusters.
I think we are digressing :) as this is not the actual topic. Rather, the point is that exiting early in scripts will cause Bpipe to wait forever on the cmd.exit file, because it never gets created (and it is my fault, as I started the -V thing).
Regarding -V: I beg to differ.
At our site we advise users to have their scripts "self contained", meaning to actually use module [purge|load|...] and not to rely on implicit environment variables. And I know at least 2 other sites (outside the organization I'm currently working for) that try to do the same. It also helps to have more reproducible results, as there is less questioning in terms of:
OK so you used tool X but:
- which version,
- which compiler and
- which MPI libraries?
It is much easier to reason about the behaviour/results if it is done explicitly. The thing is: I don't know how to deactivate -V if it is in the default template; as far as I know one can only activate it, not deactivate it...
Also, -V with SGE + bash + defined functions will trigger an old bug (ca. 2009), because the environment file of SGE is in a "KEY=VALUE\n" format (one entry per line) while bash functions are exported in this form:
BASH_FUNC_module()=() { eval `/usr/bin/modulecmd bash $*`
}
The problem is that on the execution host bash tries to set a variable to this (yes the closing brace is missing):
BASH_FUNC_module()=() { eval `/usr/bin/modulecmd bash $*`
It doesn't help to declare the function in the bash shorthand form (mind the semicolon before the closing curly brace):
module() { eval `/usr/bin/modulecmd bash $*`; }
The way it is exported in the environment still contains the newline. This leads to messages like:
-bash: module: line 1: syntax error: unexpected end of file
-bash: error importing function definition for `BASH_FUNC_module'
Depending on the configuration of SGE, the module function might still be available, because after bash shows the error it may load the initialization files if bash has been added to the login_shells setting of SGE. Then it will at least have somewhat deterministic behaviour: it shows the error every time and later "overloads" the module function with whatever comes from the initialization files. Without bash in login_shells of SGE, it depends on how different users submit different scripts. Some might have customized their shells enough to always get a login shell and some might not have done that. This makes it hard to exchange scripts among users.
These 2 seem to be related:
Agree, #142 is related to your report
I just saw this:
And I think, since Bpipe is waiting for the job to finish anyway, why not use the "-sync y" option? This way qsub will stay in the foreground and actually report back the return value from the original script.
There might be some logic required to do that with stuff like Bpipe.run { deadLock * 50 }. But it seems that all that is needed is to watch for the qsubs to come back?
(Looked at the code for the first time today so forgive me if that idea is way off).
Wow -sync y seems like an ideal option for bpipe to handle:
-sync y[es]|n[o]
Available for qsub.
-sync y causes qsub to wait for the job to complete before exiting. If the job completes successfully, qsub's exit code will be that of the completed job. If the job fails to complete successfully, qsub will print out an error message indicating why the job failed and will have an exit code of 1. If qsub is interrupted, e.g. with CTRL-C, before the job completes, the job will be canceled.
With the -sync n option, qsub will exit with an exit code of 0 as soon as the job is submitted successfully. -sync n is default for qsub.
If -sync y is used in conjunction with -now y, qsub will behave as though only -now y were given until the job has been successfully scheduled, after which time qsub will behave as though only -sync y were given.
If -sync y is used in conjunction with -t n[-m[:i]], qsub will wait for all the job's tasks to complete before exiting. If all the job's tasks complete successfully, qsub's exit code will be that of the first completed job task with a non-zero exit code, or 0 if all job tasks exited with an exit code of 0. If any of the job's tasks fail to complete successfully, qsub will print out an error message indicating why the job task(s) failed and will have an exit code of 1. If qsub is interrupted, e.g. with CTRL-C, before the job completes, all of the job's tasks will be canceled.
This would also resolve the issue I've run into that jobs submitted by bpipe don't get cancelled when I stop bpipe.
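Concretely, the submission step could then be as simple as this sketch (paths follow the cmd.sh example above; writing cmd.exit ourselves is an assumption about where Bpipe looks for the exit code):
# Submit and block: with -sync y, qsub only returns once the job has
# finished, and qsub's exit status is the job's exit status.
qsub -sync y .bpipe/commandtmp/1/cmd.sh
# Record the exit code in the file Bpipe polls for.
echo $? > .bpipe/commandtmp/1/cmd.exit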
So if we get "job template scripts" as suggested in #151 this would basically just solve all 4 issues... (Trusting that http://superuser.com/a/78470 actually works for LSF -- lsf_request_options seems to be enough for LSF. Mind you, I have never touched or seen LSF so this is just an assumption!)
- -V would just be gone as a discussion point
- queue in bpipe.config could be handled by sge_request_options
- no need for an additional parameter that might even be conflicting if sge_request_options="-q this" and queue="that"
Am I missing something?
I have a recommendation for how to handle the "job template scripts": if we make it a Groovy variable, you could write your own custom one in bpipe.config. This would eliminate the need for adding a new command-line option or bundling a new file.
Thanks for all the discussion on this. I have been following along but unfortunately since I don't work with an SGE cluster currently I haven't been able to contribute much.
It sounds like the current consensus is that implementing the script for SGE to run as a template (and perhaps more generally, support templates for the command executors) will address the issue?
My initial reaction to this bug was that Bpipe should place the command to be executed in a subshell, so that a premature exit by the user script will only terminate the subshell and not prematurely end the SGE script. Obviously there are some wider issues that are not addressed by that, so the template comes into the picture too. However currently I'm not clear on what should be in the default template: would the default template simply implement the subshell idea (or something equivalent)? Or is it preferable that the user script be completely externalised (say, written to a temporary file, to be executed from within the template script)?
Happy to implement this if I can understand it well enough, or to accept a patch from either of you if you're up for it.
So, I don't have any idea what you mean by subshell. From my perspective, a batch system takes a command exactly as though it would've been run locally and submits it to a cluster, with the possible addition of the "job options" to hint at/supply resource usage.
As I see it, there are only two valid ways to manage batch jobs:
- qsub -sync y
- the "wait for return code" method (hopefully most batch schedulers have such a feature)
The current method bpipe uses allows for jobs to disappear from the batch system for a number of reasons, and bpipe will wait forever for those jobs to finish.
From my perspective, I think that the introduction of job templates could significantly simplify the support for a number of batch systems; in most cases only a few things are needed to support a batch system.
I would envision a job script template to consist of a "sensible" set of defaults for the given batch system; it could use variable expansion to stick in things like the memory requirements, number of processors, job name etc.:
#!/bin/sh
#\$ -cwd
#\$ -l virtual_free=${config.memory}
#\$ -N \"$name\"
#\$ -terse
#\$ -o $jobDir/$CMD_OUT_FILENAME
#\$ -e $jobDir/$CMD_ERR_FILENAME
#\$ -l h_rt=${config.walltime}
${additional_options}
${cmd}
This would allow users to also do a wholesale replacement of the template with their own, allowing things like replacing virtual_free with h_vmem (as my cluster does) while still using the "memory" specification. Users of course would have to supply certain "minimal" configurations for bpipe to function.
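For example, a bpipe.config feeding such a template might look like this (executor and sge_request_options are the keys shown earlier in this thread; the memory, walltime and queue values are purely illustrative):
executor="sge"
memory="4g"
walltime="08:00:00"
sge_request_options="-q main.q"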
I don't have experience writing java/groovy, but if you can point me to some useful introduction I can attempt a proof of concept. I'm also happy to test any code anyone else supplies on my Son of Grid Engine 8.1.8 system.
I agree with @gdevenyi.
Here's my proposal for SGE defaults:
# because this is required with the proposed method and shouldn't be accidentally removed
sge_request_options="-sync y -o $jobDir/$CMD_OUT_FILENAME -e $jobDir/$CMD_ERR_FILENAME"
#\$ -cwd
#\$ -N \"$name\"
${cmd}
2 things keep me from creating a pull request:
What really confuses me are the differences between e.g. an LSF/SGE executor and a Torque/PBS executor. It seems to me that this adds unnecessarily complex code to bpipe itself, as it handles 2 different codepaths for basically the same use case. But to make some progress, I think the current issue can be fixed with the template proposal, and a discussion about the different ways executors are run should take place later (e.g. I don't really see a reason why there's a difference between a local command executor, an ssh command executor and a batch system command executor).
Further research followup:
For the LSF bsub command, the equivalent to -sync y is:
-K Submit a batch job and wait for the job to complete. In case the job needs to be rerun due to transient failures, the command will return after the job finishes. The bsub command returns the same value as the job upon completion. The bsub command exits with value 126 if the job was terminated while pending.
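So the LSF equivalent of a blocking SGE submission would be roughly this sketch (bsub reads the job script on stdin; per the excerpt above, -K makes it return the job's exit value):
# Block until the job completes; bsub -K exits with the job's value.
bsub -K < .bpipe/commandtmp/1/cmd.sh
echo $? > .bpipe/commandtmp/1/cmd.exit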
According to the PBS Pro user guide, you can use -W block=true to make qsub wait until the job completes:
http://resources.altair.com/pbs/documentation/support/PBSProUserGuide12.1.pdf
7.7 Making qsub Wait Until Job Ends
Normally, when you submit a job, the qsub command exits after returning the ID of the new job. You can use the "-W block=true" option to qsub to specify that you want qsub to "block", meaning wait for the job to complete and report the exit value of the job.
If your job is successfully submitted, qsub blocks until the job terminates or an error occurs. If job submission fails, no special processing takes place.
If the job runs to completion, qsub exits with the exit status of the job. For job arrays, blocking qsub waits until the entire job array is complete, then returns the exit status of the job array.
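So a blocking PBS Pro submission would look roughly like the SGE and LSF sketches above:
# Block until the job ends; qsub exits with the job's exit status.
qsub -W block=true .bpipe/commandtmp/1/cmd.sh
echo $? > .bpipe/commandtmp/1/cmd.exit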
In the case of Torque, it's not clear if it supports block=true; does anyone have a Torque cluster they can check that with?
For SLURM, there isn't a direct option for sbatch that I can find; however, there is a kind of workaround for it here: https://groups.google.com/forum/#!topic/slurm-devel/B2KdLOyN0sU
Unfortunately this breaks the nice template system we were going for; we would probably need a wrapper script which parses the template for salloc commands.
Since this sounds like it is evolving towards a more general revamp of how Bpipe is interacting with resource managers I have created a branch 'sync_rm' to track work on it.
I have a concern, however, about the proposed solution of submitting jobs using synchronous commands. The reason we have an asynchronous model at the moment is that the synchronous model requires us to run an ongoing process on the head node for each currently active job. If there are many active jobs, that adds up to a heavy burden on the cluster head node, and indeed in practice we have had a whole set of bugs/problems where people hit resource limits (file handle limits being the main one, but not the only one). My experience is that cluster admins take a dim view of people who run hundreds of processes on the head node. It's worth noting that they also take a dim view of people putting a big burden on the cluster manager by polling for the status of hundreds of jobs continuously, so it's an issue either way; it is just more manageable in the asynchronous case because we can rate limit the polling. I am curious what others in this thread think about this issue.
There is a separate but related stream around implementing DRMAA support, which in theory might address many of these concerns all at once, without resorting to a synchronous model for submitting the jobs.
Well, I administer a few clusters (though smaller ones), and I sure do prefer -- at this time -- that Bpipe simply blocks: that is just one blocking process per pipeline instead of possibly hundreds of polls.
SGE has a way to inspect the result of a job without messing with the job at all (e.g. qacct -j), but that requires the cluster to be configured so that this information is still there. I think the way where Bpipe simply blocks until completion is the simplest one.
Regarding the evolution into a more general solution: I am merely pointing out that there seems to be a lot of duplication in the way executors work (especially since local and cluster executors are treated somewhat differently; I fail to see why that is "a good thing").
I personally think that we should just go forward with this for LSF and SGE, as it actually fixes a few bugs where Bpipe simply breaks -- #69, #142, #151 (that last one was closed by @gdevenyi in favor of this one, and I believe the intention was to go forward and get a working Bpipe again).
Of course DRMAA would be nice, but is there any hope it can be done in the foreseeable future?
> that is just one blocking process per pipeline instead of possibly hundreds of polls.
Just trying to make sure I understand what is proposed here: if a blocking qsub is used, from how I understand it there will not be 1 blocking process per pipeline, but 1 blocking process per parallel path in the pipeline. Some people do execute pipelines that have hundreds of parallel jobs. So we could be talking about hundreds of blocking processes per pipeline. Or am I misunderstanding this?
NB: as a temporary workaround, I committed 3948dfd in the new branch to execute commands in a subshell as originally suggested by @serverhorror. If someone would test this on an SGE cluster that would be very helpful, and then I would merge it to mainline while we continue this discussion.
Thanks everyone!
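For anyone wanting to test: the generated script would then look roughly like this sketch (not the actual commit; the layout follows the cmd.sh example earlier in the thread):
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -terse
#$ -o .bpipe/commandtmp/1/cmd.out
#$ -e .bpipe/commandtmp/1/cmd.err
# Run the user command in a subshell: a premature exit (e.g. from set -e)
# terminates only the subshell, not this wrapper script.
(
set -e
echo "Hello World"
)
# The wrapper keeps running, so the exit file is always written.
echo $? > .bpipe/commandtmp/1/cmd.exit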
Just a note that a candidate fix for this issue is committed in the new branch sge_fixes.
To save confusion: there are a couple of other unrelated fixes to SGE that I checked into master as well; these are also included in the sge_fixes branch.
It would be awesome if someone is up for testing this out! I have tried it out with a toy StarCluster SGE environment but I am not sure how well it generalises to a real life cluster.
The "fix" such as it is, works as follows:
I understand this is not the favored solution mentioned above (having Bpipe run a blocking process). I'm incredibly reluctant to do that because I still can't see how it won't result in potentially hundreds of Bpipe processes on the head node, and I really don't want Bpipe to get a bad name for hogging resources on cluster login nodes.
Thanks @ssadedin that looks interesting, will test!
@ssadedin, after initial testing it looks like the template version works for me. I have some changes I'd like to make to the template; expect a pull request at some point.
So after playing around with the template version, it works well for me.
I would like to make some changes regarding:
However in both cases these options are still "hard coded" in the executor file, from what I can tell because they're "optional" and are inserted in ${additional_options}. This doesn't work well because it means I can't change the defaults for these without digging into the source.
It looks like based on http://docs.groovy-lang.org/latest/html/api/groovy/text/SimpleTemplateEngine.html that the template system can do simple logic, which looks ideal for doing something like
PSEUDOCODE for template line <% if (config.procs) print '#\$ -pe smp config.procs ' else print '' %>
Or... more importantly, allow me to specify my memory variable <% if (config.memory) print '#\$ -l h_vmem config.memory ' else print '' %>
Does this seem sensible, or am I looking at the wrong version of groovy?
Indeed, something like that should work, except you need $ next to the variables and double quotes:
<% if (config.procs) print "#\$ -pe smp $config.procs" else print '' %>
This should also be expressible as
<%= config.procs ? "#\$ -pe smp $config.procs " : '' %>
I'm not completely sure how the newlines will work out in there; just possibly you need to throw in a "\n". Hopefully SGE doesn't mind the #$ options having some extra newlines between them?
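Putting those together, the relevant lines of the default template might read like this (a sketch; whether h_vmem or virtual_free is right is site-specific, as discussed above):
<%= config.procs ? "#\$ -pe smp $config.procs\n" : '' %>
<%= config.memory ? "#\$ -l h_vmem=$config.memory\n" : '' %>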
I think this makes sense and I'm happy to put it into the default template.
Thanks for the testing and feedback!
Something else is bugging me a little bit. The SGE procs being expressed as "orte <number>" actually breaks the portability of the pipeline config, since other executors obviously don't understand the "orte" (or "smp" or whatever parallel environment is used). So I am wondering if we could change that to be two options?
sge_pe = "smp" procs = 1
Then the "procs" is portable between clusters, and for most users I think putting one global "sge_pe" setting at the top of their config will be all they have to do to port a bpipe.config to SGE.
Any thoughts about that as a change? I think we could shield it from being a breaking change by making Bpipe auto-parse the old form (procs="orte 1" => sge_pe="orte", procs=1).
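The auto-parsing could be a few lines in the executor, along these lines (a Groovy sketch of the idea only; the config property access is hypothetical):
// Accept the legacy combined form procs="orte 1" and split it into
// sge_pe + procs, so existing bpipe.config files keep working.
def procsValue = config.procs?.toString()?.trim()
if (procsValue?.contains(' ')) {
    def (pe, n) = procsValue.tokenize(' ')
    config.sge_pe = pe
    config.procs = n.toInteger()
}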
If we have a template why would we need any configuration at all as bpipe parameters?
I think there's no way to satisfy all the SGE configurations, so why not just let it be defined by the user -- a plain qsub foo.sh works for almost any script if the script doesn't contain SGE directives; anything else is "special site configuration" in my book. So why even bother and think about sane defaults for the world?
What @ssadedin is referring to is the procs setting in the pipeline configuration. On "local" configs it's just a number, but on SGE you have to specify the parallel environment as well. Personally, I'd like to be able to switch a pipeline between local and cluster with a minimum of config changes. His proposed changes sound like a good idea.
@ssadedin followup re: parallel environment specification.
The current Sge executor requests SLOTS=N if a parallel environment is not specified (which is often the case for a single threaded job). SGE rejects these jobs because you must use a PE, even if SLOTS=1.
I got around this by commenting out the additional_options line in that test so that jobs that don't explicitly have a PE specified get submitted without slots.
> The current Sge executor requests SLOTS=N if a parallel environment is not specified (which is often the case for a single threaded job). SGE rejects these jobs because you must use a PE, even if SLOTS=1.
This behaviour was in the original implementation, which was coded by a contributor (Paolo). Since I don't have real experience with SGE I didn't want to disturb it, as I assumed it serves a useful purpose. Yet like you, I find that as it is, a naive script that only sets "procs=n" simply doesn't work with an SGE cluster in its default setup due to this setting. So arguably this is something worth changing. Perhaps it was specific to Paolo's environment and not typical of most people's clusters.
In other words, I think I am arguing that we should default sge_pe to "orte" and only do the SLOTS=N behavior if the user specifically sets sge_pe = blank?
Ok, I pushed yet another change to the sge_fixes branch. Now if you specify in bpipe.config:
sge_pe="smp" procs=1
Then it will send these as a parallel environment: "-pe smp 1". In the off chance that anybody is using slots directly, it still does that if you ONLY specify procs. But if you put sge_pe as a global default in your bpipe.config it can now apply to your whole pipeline. And it should still support the combined form of procs ("procs='smp 1'"), though I would consider that deprecated.
Additionally, I took the opportunity to remove the additional_options and now all the options are constructed inside the template. So it is a bit more customizable now.
Thanks @ssadedin these new changes work as expected. I love the templating system, this is absolutely the way forward for implementing LSF, Torque, SLURM, Condor etc in a much more flexible way with far less code.
I have one other request regarding the parallel environment changes. Right now, if a job is single threaded it still gets #$ -pe smp 1 added to its job submission. For my cluster this is okay, as my SMP parallel environment is fully available across the same slots as the rest of the cluster, but SGE has the flexibility to restrict PEs to machines or subsets of a cluster, which other environments might use. If I'm truly running a single-threaded job, I'd like to avoid requesting a PE. Is that an option in the code? Or might the template engine support an if config.procs > 1 construct? I can't find appropriate docs on the templating language to check.
Thanks for the testing @gdevenyi - so great to have someone who can try all this out on a real cluster and give constructive suggestions.
> Or might the template engine support an if config.procs > 1 construct?
That sounds like a great idea. I also like that it means that a user just trying out Bpipe can try it on their cluster without specifying a PE at all. Only when they specify procs>1 will they have to figure that part out. I'll add it in.
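For reference, the guarded line could look like this in the template (a sketch using the sge_pe and procs keys from above, and assuming procs has already been parsed to a number):
<%= config.procs > 1 ? "#\$ -pe $config.sge_pe $config.procs\n" : '' %>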
Oh yes of course! Sometimes clusters don't have a PE at all, or have a PE named differently than smp. Currently all jobs would fail for this config as well :)
Calling @serverhorror: does this new style of cmd.sh wrapper with a separate command file fix your deadlock issues?
Calling @tucano: does this new template method give you the flexibility you need to work on your SGE config?
Hello,
we just found a situation where Bpipe reproducibly hangs and basically deadlocks.
bpipe.config:
What happens is that Bpipe creates a script to be submitted to SGE (and possibly other PBS systems) that looks like the following:
Because the script exits before the exit file can be written, Bpipe never detects that the script has exited and will hang indefinitely. The problem is that it is now impossible to use existing scripts that have a set -e somewhere in them, because they will simply deadlock the whole execution. A workaround while this bug is fixed is to wrap all calls within exec in a subshell, as can be seen in the deadLockWorkaround statement. I suggest choosing a different path rather than injecting arbitrary code into a script.
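For reference, the workaround in a pipeline stage looks roughly like this (Bpipe DSL sketch; deadLockWorkaround is the stage name mentioned above, the wrapped script name is hypothetical):
deadLockWorkaround = {
    // Wrapping the call in a subshell keeps a premature exit (set -e)
    // inside the wrapped script from killing the generated SGE script.
    exec "( bash ./script_with_set_e.sh )"
}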