bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

ipython config problem? #1285

Closed tombair closed 7 years ago

tombair commented 8 years ago

I have a bcbio installation that works well when I qsub to the queue via a script with only the -n option. When I submit with the -t ipython option, the run starts but fails when it checks the samtools version. I inserted some code and found that which samtools, called within versioncheck.py, returns /usr/bin/samtools, whereas in the shell which samtools returns /Shared/Bioinformatics/data/bcbio/bin/samtools. The latter is correct. I am guessing this is due to a bad ipython configuration, in that it doesn't automatically see my environment variables. Could you point me in the right direction to resolve this?

roryk commented 8 years ago

Sorry for the troubles Tom, it looks like /usr/bin is prepended to your PATH when running with IPython. Your PATH should be getting passed along with the engines and the controller files. If you set your PATH to:

export PATH=/Shared/Bioinformatics/data/bcbio/bin/:$PATH

does it still pick up the wrong samtools?

We have a utility function here: https://github.com/chapmanb/bcbio-nextgen/blob/7e21581c6ce898aae62a552ac658ff095a5c92c6/bcbio/pipeline/config_utils.py#L190 that does the checking, but it defaults to looking in the PATH. I swapped the order in this function so that it prefers the bcbio location first and then falls back to the rest of the system, and I replaced every samtools lookup I could find that wasn't using that function. I put a dummy samtools in my PATH, and running some of the tests it uses the correct one. This won't solve the actual problem, which is SGE not respecting your PATH, if that is what is happening.
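
A minimal sketch of the lookup order described above, for illustration only. It is not the actual code in config_utils.py: the function name find_program and the preferred_dir parameter are hypothetical, and it assumes Python 3 for shutil.which.

```python
import os
import shutil


def find_program(name, preferred_dir, path=None):
    """Return the path to `name`, preferring `preferred_dir` over PATH.

    Hypothetical helper for illustration; not part of bcbio.
    """
    candidate = os.path.join(preferred_dir, name)
    # Use the preferred copy when it exists and is executable.
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    # Otherwise fall back to a normal PATH search (returns None if not found).
    return shutil.which(name, path=path)


# Resolves to the bcbio copy if present, else whatever PATH provides.
print(find_program("samtools", "/Shared/Bioinformatics/data/bcbio/bin"))
```

With this ordering a stray /usr/bin/samtools earlier in PATH no longer wins, but any other tool that is looked up purely through PATH would still be affected.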

tombair commented 8 years ago

Sorry to leave this hanging; I got pulled off to another project.

I am still getting the error. To help diagnose, I inserted this into the versioncheck.py script just before the popen that checks samtools:

import os
logger.info("PATH is %s" % (os.environ['PATH']))

For the ipython job it outputs this and then returns the samtools error:

[2016-03-29T16:01Z] neon-compute-6-11.local: Testing minimum versions of installed programs
[2016-03-29T16:01Z] neon-compute-6-11.local: PATH is /usr/gnu/bin:/usr/bin:/usr/sbin:/sbin
[2016-03-29T16:01Z] neon-compute-6-11.local: Unexpected error

When I start using a qsub job and just the -n argument I get:

[2016-03-29T17:25Z] Testing minimum versions of installed programs
[2016-03-29T17:25Z] PATH is /Shared/Bioinformatics/data/bcbio/bin:/state/partition1/sge_jobs/4886738.1.UI:/Shared/Bioinformatics/data/bcbio/bin:/Shared/Bioinformatics/data/bcbio/bin:/opt/pandoc/1.15.2/bin:/opt/R/3.2.1/bin:/opt/intel/composer_xe_2015.3.187/bin/intel64:/opt/intel/composer_xe_2015.3.187/debugger/gdb/intel64_mic/bin:/opt/Python-2.7/bin:/usr/lib64/qt-3.3/bin:/opt/modules/Modules/3.2.10/bin:/usr/local/bin:/bin:/usr/bin:/opt/eclipse:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/rocks/bin:/opt/rocks/sbin:/opt/gridengine/bin/linux-x64:/usr/local/sbin:/usr/sbin:/sbin:/Users/tbair/bin:/Users/tbair/bin
[2016-03-29T17:25Z] Timing: alignment preparation

My $PATH is:

/Shared/Bioinformatics/data/bcbio/bin:/Shared/Bioinformatics/data/bcbio/bin:/opt/pandoc/1.15.2/bin:/opt/R/3.2.1/bin:/opt/intel/composer_xe_2015.3.187/bin/intel64:/opt/intel/composer_xe_2015.3.187/debugger/gdb/intel64_mic/bin:/opt/Python-2.7/bin:/usr/lib64/qt-3.3/bin:/opt/modules/Modules/3.2.10/bin:/usr/local/bin:/bin:/usr/bin:/opt/eclipse:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/rocks/bin:/opt/rocks/sbin:/opt/gridengine/bin/linux-x64:/usr/local/sbin:/usr/sbin:/sbin:/Users/tbair/bin
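
To make the difference between the two environments easier to see, a small sketch like the following lists the directories that are on one PATH but missing from another. The helper name missing_dirs is hypothetical, the engine PATH is copied from the ipython log above, and the shell PATH is shortened here for readability.

```python
import os


def missing_dirs(expected_path, actual_path):
    """Directories in `expected_path` that do not appear in `actual_path`.

    Order is preserved and duplicate entries are reported once.
    """
    actual = set(actual_path.split(os.pathsep))
    seen = set()
    missing = []
    for entry in expected_path.split(os.pathsep):
        if entry and entry not in actual and entry not in seen:
            seen.add(entry)
            missing.append(entry)
    return missing


# PATH reported by the IPython engine in the log above.
engine_path = "/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin"
# Shortened version of the login-shell PATH (illustrative subset).
shell_path = "/Shared/Bioinformatics/data/bcbio/bin:/usr/local/bin:/bin:/usr/bin:/usr/sbin:/sbin"

for directory in missing_dirs(shell_path, engine_path):
    print("missing on engine:", directory)
```

The engine PATH contains none of the bcbio or site-specific directories, which suggests the engine environment is built from scratch rather than inherited from the submitting shell.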

This is all with the updated config_utils.py.

I think this is probably something I am doing wrong with the ipython configuration?

chapmanb commented 8 years ago

Tom; Sorry to be slow in responding to this. It sounds like your cluster compute environment is unsetting your PATH as part of job submission. I'd recommend the same as Rory: setting the PATH explicitly in the submission script to point at bcbio. Does that help resolve the problem? It might be worth checking with your local cluster folks about the recommended way to pass environment variables from the submission script so they don't get reset. Hope this helps.
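
As a rough sketch of setting the PATH explicitly, assuming the job is launched from a Python wrapper rather than a shell script: the wrapper and the prepend_to_path helper are hypothetical, and the bcbio directory is the one mentioned earlier in this thread. In a shell submission script the equivalent is the export PATH line Rory suggested.

```python
import os
import subprocess
import sys


def prepend_to_path(directory, env=None):
    """Return a copy of `env` (default: os.environ) with `directory` first on PATH."""
    new_env = dict(os.environ if env is None else env)
    current = new_env.get("PATH", "")
    new_env["PATH"] = directory + (os.pathsep + current if current else "")
    return new_env


env = prepend_to_path("/Shared/Bioinformatics/data/bcbio/bin")
# The child process sees the modified PATH regardless of what the parent inherited.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['PATH'])"],
    env=env, capture_output=True, text=True, check=True,
)
print(result.stdout.strip())
```

This only affects processes started by the wrapper itself. If the scheduler rebuilds the environment for the remote engines, the PATH still has to be passed through whatever mechanism the cluster supports.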

lpantano commented 7 years ago

Hi @tombair

I am closing this because it seems to be an old issue. Please feel free to reopen it if you still have issues with this and are trying to solve it.

cheers