Closed snsansom closed 6 years ago
Thanks, Steve.
We do:
os.environ.update({'BASH_ENV': os.path.join(os.environ['HOME'],'.bashrc')})
in Cluster.py
, since Pipeline.py
does bash -e submission-script.sh
and that also creates a non-interactive shell.
I wonder whether we should be consistent and use the same solution for local jobs?
Hi Sebastian,
So in Cluster.py I think the situation is queue manger specific.
(1) SGE: (If my understanding is correct) the -V flag copies the submission node's environment? (2) SLURM: documented behaviour is to copy the submission node's environment (3) PBS and torque have no way to copy the environment, so we are using os.environ.update.
Copying the submission node's environment might then be considered preferable because: (1) This is 'reference' behaviour as it is what happens at CGAT (SGE queue) for cluster jobs (2) It seems natural that users should expect execution environments to be configured the same as the submission environment (3) We cannot guarantee that the expected environment will be specified by ".bashrc" - rather than e.g. ".bash_profile" or indeed interactively by the user prior to launching a pipeline...
Hi Steve,
Sorry for being unclear.
I agree with you and I think your changes are a good idea, but I thought adding the BASH_ENV=.bashrc
to the environment for local jobs is a good idea to be consistent with the way we run jobs with Cluster.py
.
To be more precise, I have committed a couple of changes to this branch so you can have a look and test it on your end. Of course I am happy to revert my commit if you are not happy with it or foresee potential problems.
Best regards, Sebastian
Sorry, forgot to add a reference to the line where BASH_ENV=.bashrc
is set in Cluster.py
for all job schedulers:
https://github.com/CGATOxford/CGATPipelines/blob/master/CGATPipelines/Pipeline/Cluster.py#L38
Hi Sebastian.
Ok sorry - now I see what you are saying.
So for e.g. SLURM/SGE cluster jobs the sequence is: (1) define environment from bashrc (2) pass through environmental variables from calling shell (e.g. -V)
Presumably this is the desired sequence?, i.e. to first read the .bashrc and then to update from the actual environment (the order is obviously important!)
While e.g. for Torque the sequence is currently: (1) set environment from bashrc (2) set environment from os.environment (3) update environment from bashrc
Seems that for Torque we should just be doing this: (1) set environment from bashrc (2) update environment from os.environment
Note that for local jobs, .bashrc was already executed by the calling interactive shell (when it started), and the user may have subsequently modified the environment... So for consistency I think we should either (a) just copy os.environ or (b) (i) make an new environ from .bashrc and (ii) update new enivronment from os.environ? does that make sense?
Hi Steve,
I would say:
Also, would it be possible to compare the output of the local jobs on your end before and after the last commit to make sure we don't break anything before merging into the master?
Best regards, Sebastian
Hi Sebastian,
Happy to leave the PBS/Torque code alone.
Edit - Ok assuming subprocess.Popen evaluates its "env" argument after the shell evaluates the "BASH_ENV" reference (seems logical, can't find any documentation, but a simple test seems to confirm it is the case), this looks good.
Please see additional commit to also fix in Control.py
Many thanks, Steve
P.S. I can't run local jobs without these commits!
Many thanks, Steve.
All looking good to me so I will merge now.
Best regards, Sebastian
If cgat is installed in a venv, the location of the "cgat" wrapper script is added to PATH when the venv is sourced.
venvs are typically sourced in the users .bashrc.
.bashrc (and .profile) scripts are not evaluated by non-interactive shells.
The shell opened by subprocess with "shell=True" is non-interactive.
Thus, unless a copy of the "cgat" wrapper is also avaliable on some path not set in .bashrc, execution of jobs locally (i.e. non cluster) that use the cgat wrapper (e.g. cgat csv2db) will fail (cgat: command not found) for venv based installs.
This fixes the issue by copying the environment path to the subprocess shell.