fiberseq / fiberseq-smk

Snakemake pipeline for processing fiberseq data
https://fiberseq.github.io/fiberseq-smk/
MIT License

Conda env not being loaded on jobs/rules using the sge-gs cluster profile #13

Closed · MorganHamm closed this issue 1 year ago

MorganHamm commented 1 year ago

When I run the test case using the sge-gs profile, execution of the ccs_zmws rule fails. Here is the command I try to run:

# --profile : sets up where and how jobs are submitted
# env       : sets the conda env for the jobs, always the same
# test      : path to the ccs reads with HiFi kinetics; the key sets the sample name
# ref       : reference to align results to
snakemake \
  --profile profile/sge-gs \
  --config \
    env="fiberseq-smk" \
    test=.test/ccs.bam \
    ref=.test/ref.fa
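
For context, a Snakemake profile for an SGE cluster is just a config.yaml of command-line options. A rough sketch (illustrative only, not the actual contents of profile/sge-gs; the qsub line in particular is hypothetical, while the remaining keys mirror the options visible in the generated job script further down):

# hypothetical config.yaml for an SGE profile (not the real profile/sge-gs)
cluster: "qsub -cwd -pe serial {threads} -l mem_free={resources.mem_mb}M -o logs/ -e logs/"
jobs: 100
use-conda: true          # run rules inside conda environments
conda-frontend: mamba    # resolve/build envs with mamba
latency-wait: 30         # tolerate shared-filesystem lag on output files
printshellcmds: true
default-resources:
  - mem_mb=4096
  - disk_mb=4096
  - time=40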

The following is the Snakemake error:

Submitted job 8 with external jobid '292379701'.
[Tue Jan 17 10:23:35 2023]
Error in rule ccs_zmws:
    jobid: 8
    input: .test/ccs.bam
    output: temp/test/ccs_zmws/ccs_zmws.txt
    log: logs/test/ccs_zmws/ccs_zmws.log (check log file(s) for error details)
    conda-env: fiberseq-smk
    shell:

        (samtools view -F 2304 -@ 8 .test/ccs.bam
            | cut -f 1
            | sed 's#/ccs##g'
            | sed 's#/$##g'
            | sed 's#.*/##g'
            | sort -g -S 1G --parallel 8
            > temp/test/ccs_zmws/ccs_zmws.txt
        ) 2> logs/test/ccs_zmws/ccs_zmws.log
        #bamsieve --show-zmws .test/ccs.bam > temp/test/ccs_zmws/ccs_zmws.txt 2> logs/test/ccs_zmws/ccs_zmws.log

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    cluster_jobid: 292379701
Logfile logs/test/ccs_zmws/ccs_zmws.log not found.

Error executing rule ccs_zmws on cluster (jobid: 8, external: 292379701, jobscript: /net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk/.snakemake/tmp.naew5fab/snakejob.ccs_zmws.8.sh). For error details see the cluster log and the log files of the involved rule(s).
Trying to restart job 8.

In the log file I get errors like this:

Exception in file /net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk/workflow/rules/common.smk, line 100:
Cannot find hck in PATH. Please see the README for installation instructions.
  File "/net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk/workflow/Snakefile", line 48, in <module>
  File "/net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk/workflow/rules/common.smk", line 160, in check_for_tools
  File "/net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk/workflow/rules/common.smk", line 100, in is_tool

However, when I type which hck in a terminal with the fiberseq-smk environment active, I get a path (~/miniconda3/envs/fiberseq-smk/bin/hck), so it seems the conda environment is not being activated when the job runs.
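
This is easy to check from outside Snakemake (my own probe, assuming SGE's qsub is available and that home directories are shared with the compute nodes):

# submit a one-line diagnostic job; -V (export the caller's environment) is
# deliberately omitted to mimic the bare environment snakemake jobs see
echo 'which hck; echo "PATH=$PATH"' | qsub -N envprobe -o envprobe.out -e envprobe.err
# if the env is not activated in batch jobs, "which hck" prints nothing and
# PATH lacks ~/miniconda3/envs/fiberseq-smk/bin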

Workaround

I can get the test case to run successfully by adding conda activate fiberseq-smk to my .bash_profile, which confirms that the error is caused by the conda environment not being loaded on the compute nodes.
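
For reference, these are the lines I appended (workaround only; assumes miniconda3 lives under $HOME, as it does here):

# appended to ~/.bash_profile
# expose the conda command to non-interactive shells, then activate the env
source "$HOME/miniconda3/etc/profile.d/conda.sh"
conda activate fiberseq-smk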

Other details

$ conda --version
conda 22.11.1

I also found the temporary script used to run one of the failed jobs. I've added line breaks to make it more readable:

#!/bin/sh
# properties = {"type": "single", "rule": "ccs_zmws", "local": false, "input": [".test/ccs.bam"], "output": ["temp/test/ccs_zmws/ccs_zmws.txt"], "wildcards": {"sm": "test"}, "params": {}, "log": ["logs/test/ccs_zmws/ccs_zmws.log"], "threads": 1, "resources": {"mem_mb": 4096, "disk_mb": 4096, "tmpdir": "/tmp/288852750.1.trapnell-login.q", "time": 40}, "jobid": 7, "cluster": {}}
cd '/net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk' && \
    /net/trapnell/vol1/home/mhamm/miniconda3/envs/fiberseq-smk/bin/python3.9 -m snakemake \
        --snakefile '/net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk/workflow/Snakefile' \
        'temp/test/ccs_zmws/ccs_zmws.txt' --allowed-rules 'ccs_zmws' --cores 'all' --attempt 1 --force-use-threads  \
        --wait-for-files '/net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk/.snakemake/tmp.wkn0mj9y' \
        '.test/ccs.bam' --force --keep-target-files --keep-remote --max-inventory-time 0 --nocolor --notemp --no-hooks \
        --nolock --ignore-incomplete --rerun-triggers 'mtime' --skip-script-cleanup  --use-conda  --conda-frontend 'mamba' \
        --conda-base-path '/net/trapnell/vol1/home/mhamm/miniconda3' --wrapper-prefix 'https://github.com/snakemake/snakemake-wrappers/raw/' \
        --config 'env=fiberseq-smk' 'test=.test/subreads.bam' 'ccs=.test/ccs.bam' 'ref=.test/ref.fa' --printshellcmds  \
        --latency-wait 30 --scheduler 'ilp' --scheduler-solver-path '/net/trapnell/vol1/home/mhamm/miniconda3/envs/fiberseq-smk/bin' \
        --default-resources 'mem_mb=4096' 'disk_mb=4096' 'tmpdir=system_tmpdir' 'time=40' --mode 2 && \
    touch '/net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk/.snakemake/tmp.wkn0mj9y/7.jobfinished' || \
    (touch '/net/trapnell/vol1/home/mhamm/fiber_seq/nobackup/fiberseq-smk/.snakemake/tmp.wkn0mj9y/7.jobfailed'; exit 1)
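
Note that the script invokes the environment's python3.9 by absolute path rather than activating the environment, so Snakemake itself starts up fine, but the env's bin/ directory never ends up on PATH, which is exactly what the is_tool check in common.smk trips over. One generic way a profile can deal with this (a sketch of a standard Snakemake mechanism, not necessarily what the fix on dev does) is a custom cluster jobscript that activates the env before the re-invoked Snakemake call:

#!/bin/sh
# properties = {properties}
# hypothetical custom jobscript: point the profile's "jobscript:" key (or the
# --jobscript flag) at this file so every cluster job activates the env first
source "$HOME/miniconda3/etc/profile.d/conda.sh"
conda activate fiberseq-smk
{exec_job}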

I tried re-installing conda and mamba from scratch and creating a new environment from the YAML file, but the problem persisted.

mrvollger commented 1 year ago

Hi Morgan,

I have an idea about what might be causing issues for you on the GS cluster. Can you pull this commit ^ from the dev branch and give it a try?

MorganHamm commented 1 year ago

Pulling that commit from dev seems to have fixed the issue. I no longer need to activate the env in my .bash_profile. Thanks!