ncbi / fcs

Foreign Contamination Screening caller scripts and documentation

Can run fcs.py interactively, but in a batch job keep getting: "No such file or directory: 'docker'" #55

Closed SchwarzEM closed 11 months ago

SchwarzEM commented 11 months ago

Describe the bug

I can run fcs.py on my system interactively from the command line, but when I run exactly the same commands in a SLURM batch job, fcs.py crashes with the error "No such file or directory: 'docker'". This is a serious problem, because I need to run long jobs in batch mode to use this software for real work.

To Reproduce

First, to prove to myself that I could run fcs.py on my system as long as I worked from an interactive command-line, I did the following:

mkdir $PROJECT/src/NCBI_FCS ;
cd $PROJECT/src/NCBI_FCS ;

curl -LO https://github.com/ncbi/fcs/raw/main/dist/fcs.py ;
curl https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/FCS/releases/latest/fcs-gx.sif -Lo fcs-gx.sif ;

SOURCE_DB_MANIFEST="https://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/FCS/database/test-only/test-only.manifest" ;
LOCAL_DB="/ocean/projects/mcb190015p/schwarze/src/NCBI_FCS" ;
python3 fcs.py db get --mft "$SOURCE_DB_MANIFEST" --dir "$LOCAL_DB/test-only" ;

[This took a while in interactive mode, but it did run to completion successfully.]

# Get a test genome (FASTA) to screen:
curl -LO https://github.com/ncbi/fcs/raw/main/examples/fcsgx_test.fa.gz ;

# Look at what was downloaded for the test-only database:
ls -l $PROJECT/src/NCBI_FCS/test-only ;

-rw-r--r-- 1 schwarze mcb190015p    8226897 Oct 16 17:09 test-only.blast_div.tsv.gz
-rw-r--r-- 1 schwarze mcb190015p 4522535378 Oct 16 17:10 test-only.gxi
-rw-r--r-- 1 schwarze mcb190015p   70841083 Oct 16 17:09 test-only.gxs
-rw-r--r-- 1 schwarze mcb190015p       1292 Oct 16 17:11 test-only.manifest
-rw-r--r-- 1 schwarze mcb190015p         56 Oct 16 17:09 test-only.meta.jsonl
-rw-r--r-- 1 schwarze mcb190015p      21847 Oct 16 17:09 test-only.seq_info.tsv.gz
-rw-r--r-- 1 schwarze mcb190015p       6063 Oct 16 17:09 test-only.taxa.tsv

# Verify functionality by using the small 'test-only' database.
# No, NCBI, I'm not ROOT and I don't have RAM disking at will:
GXDB_LOC=$PROJECT/src/NCBI_FCS/test-only ;

# Try running this interactively: 
python3 ./fcs.py screen genome --fasta ./fcsgx_test.fa.gz --out-dir ./gx_out/ --gx-db "$GXDB_LOC/test-only" --tax-id 6973 ;

[I have no way to tell if this is what I should be seeing, but I *am* getting a coherent-looking set of results, so I'll assume that these results represent a successful test run:]

--------------------------------------------------------------------------------------------------
Warning: Asserted tax-div 'anml:insects' is well-represented in db, but absent from inferred-primary-divs.
This means that either asserted tax-div is incorrect, or the input is predominantly contamination.
Will trust the asserted div and treat inferred-primary-divs as contaminants.
--------------------------------------------------------------------------------------------------

Asserted div               : anml:insects
Inferred primary-divs      : ['prok:CFB group bacteria']
Corrected primary-divs     : ['anml:insects']
Putative contaminant divs  : ['prok:CFB group bacteria']
Aggregate coverage         : 51%
Minimum contam. coverage   : 30%

--------------------------------------------------------------------

fcs_gx_report.txt contamination summary:
----------------------------------------
                                seqs      bases
                               ----- ----------
TOTAL                            243   27170378
-----                          ----- ----------
prok:CFB group bacteria          243   27170378

--------------------------------------------------------------------

fcs_gx_report.txt action summary:
---------------------------------
                                seqs      bases
                               ----- ----------
TOTAL                            243   27170378
-----                          ----- ----------
EXCLUDE                          214   25795430
REVIEW                            29    1374948

--------------------------------------------------------------------

# Judging by various results, this does seem to have worked.

ls -lt ;

drwxr-xr-x 2 schwarze mcb190015p      4096 Oct 16 17:37 gx_out
-rw-r--r-- 1 schwarze mcb190015p   8973020 Oct 16 17:16 fcsgx_test.fa.gz
drwxr-xr-x 2 schwarze mcb190015p      4096 Oct 16 17:11 test-only
-rw-r--r-- 1 schwarze mcb190015p 199995392 Oct 16 17:09 fcs-gx.sif
-rw-r--r-- 1 schwarze mcb190015p     17379 Oct 16 17:09 fcs.py

ls -l gx_out ;

-rw-r--r-- 1 schwarze mcb190015p  21614 Oct 16 17:37 fcsgx_test.fa.6973.fcs_gx_report.txt
-rw-r--r-- 1 schwarze mcb190015p 131726 Oct 16 17:37 fcsgx_test.fa.6973.taxonomy.rpt

wc -l gx_out/fcsgx_test.fa.6973.fcs_gx_report.txt ;

245 gx_out/fcsgx_test.fa.6973.fcs_gx_report.txt

# Next, try doing exactly what I did above, but *this* time in a batch job.

mv -i gx_out manual_gx_out_dir ;
mamba deactivate ;

sbatch job_FCS_test_2023.10.16.01.sh ;

[Contents of job_FCS_test_2023.10.16.01.sh:]
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --partition=RM-shared
#SBATCH --time=008:00:00
#SBATCH --ntasks-per-node=8
#SBATCH --job-name=job_FCS_test_2023.10.16.01.sh
#SBATCH --mail-type=ALL
cd $PROJECT/src/NCBI_FCS ;
source $HOME/.bashrc_mamba ;
. $PROJECT/mambaforge-pypy3/etc/profile.d/mamba.sh ;
mamba activate python_3.7.0 ;
GXDB_LOC=$PROJECT/src/NCBI_FCS/test-only ;
python3 ./fcs.py screen genome --fasta ./fcsgx_test.fa.gz --out-dir ./gx_out/ --gx-db "$GXDB_LOC/test-only" --tax-id 6973 ;
mamba deactivate ;

[This does start to run, but crashes with the following message, and leaves no output files:]

cat slurm-19859569.out ;

Traceback (most recent call last):
  File "./fcs.py", line 480, in <module>
    sys.exit(main())
  File "./fcs.py", line 469, in main
    gx.run()
  File "./fcs.py", line 386, in run
    self.args.func(self)
  File "./fcs.py", line 364, in run_screen_mode
    self.run_gx()
  File "./fcs.py", line 235, in run_gx
    self.safe_exec(docker_args)
  File "./fcs.py", line 150, in safe_exec
    subprocess.run(args, shell=False, check=True, text=True, stdout=sys.stdout, stderr=sys.stderr)
  File "/ocean/projects/mcb190015p/schwarze/mambaforge-pypy3/envs/python_3.7.0/lib/python3.7/subprocess.py", line 453, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/ocean/projects/mcb190015p/schwarze/mambaforge-pypy3/envs/python_3.7.0/lib/python3.7/subprocess.py", line 756, in __init__
    restore_signals, start_new_session)
  File "/ocean/projects/mcb190015p/schwarze/mambaforge-pypy3/envs/python_3.7.0/lib/python3.7/subprocess.py", line 1499, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'docker': 'docker'

[To try to fix this problem in the batch jobs, I manually installed a 'docker' module in the mamba Python environment:]

mamba activate python_3.7.0 ;
pip install docker ;
[installation runs]
mamba deactivate ;

[But when I then reran the above batch script after installing this 'docker' Python module, I still got the same error message as above.]
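
A note on why this step could not help: the traceback shows subprocess.run failing to find an executable named docker on PATH. The docker package on PyPI is a Python client library for the Docker API; it does not install the docker command-line program. Below is a minimal sketch for checking, from inside the batch job, which container runtime executables are actually visible. The helper name find_runtime is hypothetical, and the assumption that fcs.py needs either singularity or docker on PATH is inferred from this thread, not from the fcs.py source.

```shell
# Hypothetical helper: print the first of the given commands found on PATH.
# Returns 0 if one is found, 1 otherwise.
find_runtime() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Assumption: fcs.py launches either singularity or docker as an external program.
if runtime=$(find_runtime singularity docker); then
    echo "container runtime found on PATH: $runtime"
else
    echo "no container runtime found on PATH" >&2
fi
```

Adding these lines near the top of the batch script, just before the fcs.py call, would show in the slurm-*.out file what the job itself can see.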

Software versions (please complete the following information):

Log Files

Even with --debug, the entire error message was the same traceback as shown above under "To Reproduce".

SchwarzEM commented 11 months ago

And then, after being stuck with all of that, I finally found a subtle bugfix in issue #48 that actually solved my problems while running in a batch job!

Specifically, adding --image fcs-gx.sif to the command line immediately after python3 fcs.py made the command work fine in a batch job, whereas previously it had been crashing. In full detail:

mamba activate python_3.7.0 ;
GXDB_LOC=$PROJECT/src/NCBI_FCS/test-only ;
python3 fcs.py --image fcs-gx.sif screen genome --debug --fasta ./fcsgx_test.fa.gz --out-dir ./gx_out/ --gx-db "$GXDB_LOC/test-only" --tax-id 6973 ;

I am still mystified about why it was necessary for me to do this in a batch job but not in an interactive line command; but, I'm glad it at least seems to be working now.
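
One way to investigate the difference (a sketch, not a confirmed diagnosis): an interactive shell and a SLURM batch job can start with different environments, so a variable such as FCS_DEFAULT_IMAGE or a PATH entry present in one may be missing in the other. Running the following in both contexts and comparing the output would show whether that is the case here. The helper name report_vars is hypothetical.

```shell
# Hypothetical helper: print the state of each named environment variable.
report_vars() {
    for name in "$@"; do
        # Indirect lookup through eval keeps this portable to plain POSIX sh.
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            echo "$name=$value"
        else
            echo "$name is unset or empty"
        fi
    done
}

# Variables that plausibly affect how fcs.py finds its container image and runtime.
report_vars FCS_DEFAULT_IMAGE PATH
```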

etvedte commented 11 months ago

Hello,

We haven't tested docker installs in mamba environments.

In any case, glad you got it to run. What happened here is that the Singularity image file was never set explicitly as an environment variable (export FCS_DEFAULT_IMAGE=fcs-gx.sif), so fcs.py tried to launch Docker instead, which is where the 'docker' error came from. The problem was resolved once you supplied the image with the --image parameter in the fcs.py command. This is in the documentation, but it is easy to miss. Since you got it to work with the Singularity image, I would recommend just proceeding with that. https://github.com/ncbi/fcs/wiki/FCS-GX#a-retrieve-the-required-files
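
As a sketch of the environment-variable route in a batch script: the variable name FCS_DEFAULT_IMAGE comes from the reply above, while the helper set_fcs_image and the use of an absolute path under $PROJECT are assumptions made for illustration. Checking that the file exists before exporting makes a wrong path fail early with a clear message.

```shell
# Hypothetical helper: export FCS_DEFAULT_IMAGE only if the image file exists.
set_fcs_image() {
    image_path="$1"
    if [ ! -f "$image_path" ]; then
        echo "Singularity image not found: $image_path" >&2
        return 1
    fi
    export FCS_DEFAULT_IMAGE="$image_path"
}

# Intended use inside the batch script, before the fcs.py call
# (the path is assumed from the download step earlier in this issue):
#   set_fcs_image "$PROJECT/src/NCBI_FCS/fcs-gx.sif" || exit 1
#   python3 ./fcs.py screen genome --fasta ./fcsgx_test.fa.gz --out-dir ./gx_out/ --gx-db "$GXDB_LOC/test-only" --tax-id 6973
```

Passing --image fcs-gx.sif on the command line, as in the working command above, should have the same effect for a single invocation; the export is convenient when a script calls fcs.py several times.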