Copy environment to subprocess

snsansom commented 6 years ago

If cgat is installed in a venv, the location of the "cgat" wrapper script is added to PATH when the venv is sourced.

venvs are typically sourced in the users .bashrc.

.bashrc (and .profile) scripts are not evaluated by non-interactive shells.

The shell opened by subprocess with "shell=True" is non-interactive.

Thus, unless a copy of the "cgat" wrapper is also avaliable on some path not set in .bashrc, execution of jobs locally (i.e. non cluster) that use the cgat wrapper (e.g. cgat csv2db) will fail (cgat: command not found) for venv based installs.

This fixes the issue by copying the environment path to the subprocess shell.

sebastian-luna-valero commented 6 years ago

Thanks, Steve.

We do:

os.environ.update({'BASH_ENV': os.path.join(os.environ['HOME'],'.bashrc')})

in Cluster.py, since Pipeline.py does bash -e submission-script.sh and that also creates a non-interactive shell.

I wonder whether we should be consistent and use the same solution for local jobs?

snsansom commented 6 years ago

Hi Sebastian,

So in Cluster.py I think the situation is queue manger specific.

(1) SGE: (If my understanding is correct) the -V flag copies the submission node's environment? (2) SLURM: documented behaviour is to copy the submission node's environment (3) PBS and torque have no way to copy the environment, so we are using os.environ.update.

Copying the submission node's environment might then be considered preferable because: (1) This is 'reference' behaviour as it is what happens at CGAT (SGE queue) for cluster jobs (2) It seems natural that users should expect execution environments to be configured the same as the submission environment (3) We cannot guarantee that the expected environment will be specified by ".bashrc" - rather than e.g. ".bash_profile" or indeed interactively by the user prior to launching a pipeline...

sebastian-luna-valero commented 6 years ago

Hi Steve,

Sorry for being unclear.

I agree with you and I think your changes are a good idea, but I thought adding the BASH_ENV=.bashrc to the environment for local jobs is a good idea to be consistent with the way we run jobs with Cluster.py.

To be more precise, I have committed a couple of changes to this branch so you can have a look and test it on your end. Of course I am happy to revert my commit if you are not happy with it or foresee potential problems.

Best regards, Sebastian

sebastian-luna-valero commented 6 years ago

Sorry, forgot to add a reference to the line where BASH_ENV=.bashrc is set in Cluster.py for all job schedulers: https://github.com/CGATOxford/CGATPipelines/blob/master/CGATPipelines/Pipeline/Cluster.py#L38

snsansom commented 6 years ago

Hi Sebastian.

Ok sorry - now I see what you are saying.

So for e.g. SLURM/SGE cluster jobs the sequence is: (1) define environment from bashrc (2) pass through environmental variables from calling shell (e.g. -V)

Presumably this is the desired sequence?, i.e. to first read the .bashrc and then to update from the actual environment (the order is obviously important!)

While e.g. for Torque the sequence is currently: (1) set environment from bashrc (2) set environment from os.environment (3) update environment from bashrc

Seems that for Torque we should just be doing this: (1) set environment from bashrc (2) update environment from os.environment

Note that for local jobs, .bashrc was already executed by the calling interactive shell (when it started), and the user may have subsequently modified the environment... So for consistency I think we should either (a) just copy os.environ or (b) (i) make an new environ from .bashrc and (ii) update new enivronment from os.environ? does that make sense?

sebastian-luna-valero commented 6 years ago

Hi Steve,

I would say:

the sequence for SLURM/SGE is correct, otherwise we would have noticed already
other job schedulers: I agree the second sequence makes more sense but PBS/Torque users have not complained about its current config and I just prefer to leave it as it is.
for local jobs, after my last commit the sequence should behave as explained in (b), and I think that's the better approach for consistency

Also, would it be possible to compare the output of the local jobs on your end before and after the last commit to make sure we don't break anything before merging into the master?

Best regards, Sebastian

snsansom commented 6 years ago

Hi Sebastian,

Happy to leave the PBS/Torque code alone.

Edit - Ok assuming subprocess.Popen evaluates its "env" argument after the shell evaluates the "BASH_ENV" reference (seems logical, can't find any documentation, but a simple test seems to confirm it is the case), this looks good.

Please see additional commit to also fix in Control.py

Many thanks, Steve

P.S. I can't run local jobs without these commits!

sebastian-luna-valero commented 6 years ago

Many thanks, Steve.

All looking good to me so I will merge now.

Best regards, Sebastian

CGATOxford / CGATPipelines

Copy environment to subprocess #411