superphy / spfy

Spfy: an integrated graph database for real-time prediction of Escherichia coli phenotypes and downstream comparative analyses
https://lfz.corefacility.ca/superphy/grouch/
Apache License 2.0

Update `rgi=3.1.1` to `rgi=4.0.3` #296

Closed. kevinkle closed this issue 6 years ago

kevinkle commented 6 years ago

Looks like v4 fixed the following bug:

    Wrapper for RGI. Note RGI has a bug, namely:
    rgi.py, line 816, in runBlast
        with open(working_directory+"/"+outputFile+".json", 'w') as f:
    What this means is that while absolute paths for '-i' are fine, rgi cannot handle absolute paths for '-o'. Output files must be a basename only (and are thus written to your current directory), otherwise the base rgi call will fail.
    Note: As shown in the above code, RGI will also ignore extensions specified in '-o'.

Will have to make some changes to our wrapper.
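For reference, the basename-only workaround described in the docstring above can be sketched roughly as follows. This is a minimal sketch, not the project's actual wrapper: `plan_rgi_v3_call` is a hypothetical helper, and it assumes the `working_directory` in the quoted `rgi.py` line is the process's current directory, so the caller would pass the returned directory as `cwd` to `subprocess`.

```python
import os


def plan_rgi_v3_call(input_path, output_path):
    """Plan an RGI 3.x call that avoids absolute paths for '-o'.

    Hypothetical helper. Returns (args, cwd, expected_json):
      args          -- the '-i'/'-o' fragment to append to the rgi command
      cwd           -- directory to run rgi in, so the basename-only output lands there
      expected_json -- where the result should appear, assuming rgi appends '.json'
    """
    out_dir, out_name = os.path.split(os.path.abspath(output_path))
    # Per the docstring, any extension given to '-o' is ignored, so drop it.
    stem, _ext = os.path.splitext(out_name)
    # Keep the input absolute so it still resolves after changing directory.
    args = ['-i', os.path.abspath(input_path), '-o', stem]
    expected_json = os.path.join(out_dir, stem + '.json')
    return args, out_dir, expected_json
```

The caller would then run something like `subprocess.call(base_cmd + args, cwd=cwd)`, where `base_cmd` is whatever the 3.x entry point was; that part is left out here because the thread does not show it.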

kevinkle commented 6 years ago

modules.amr.amr.amr('/datastore/2018-06-04-04-23-20-021208-ECI-2644_lcl.fasta') from amr
129911fc-0069-4375-82b5-05b977741dbc
Failed 6 minutes ago
Traceback (most recent call last):
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/worker.py", line 700, in perform_job
    rv = job.perform()
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/job.py", line 500, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "./modules/amr/amr.py", line 27, in amr
    '-o', outputname])
  File "/opt/conda/envs/backend/lib/python2.7/subprocess.py", line 168, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/opt/conda/envs/backend/lib/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/opt/conda/envs/backend/lib/python2.7/subprocess.py", line 1024, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
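
An `OSError: [Errno 2]` raised from inside `subprocess` like this usually means the executable named in the first element of the argument list could not be found, rather than that the input FASTA is missing. A minimal sketch of failing earlier with a clearer message is below. `run_tool` is a hypothetical helper, and it is written for Python 3 (`shutil.which` is not available in the Python 2.7 environment shown in the traceback).

```python
import shutil
import subprocess
import sys


def run_tool(args):
    """Run an external command, checking first that its executable can be found.

    Hypothetical helper: raises RuntimeError naming the missing executable
    instead of letting subprocess raise a bare 'No such file or directory'.
    """
    executable = args[0]
    if shutil.which(executable) is None:
        raise RuntimeError("executable not found on PATH: %s" % executable)
    # Returns the command's exit code, like subprocess.call in the traceback.
    return subprocess.call(args)


if __name__ == "__main__":
    # Sanity check using the running interpreter, which is known to exist.
    print(run_tool([sys.executable, "-c", "pass"]))
```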

kevinkle commented 6 years ago

Or maybe it's just a new interface:

root@aa6de58fc93e:/app# rgi
usage: rgi <command> [<args>]
            commands are:
               main     Runs rgi application
               tab      Creates a Tab-delimited from rgi results
               parser   Creates categorical .json files RGI wheel visualization. An input .json file containing the RGI results must be input.
               load     Loads CARD database json file
               clean    Removes BLAST databases and temporary files
               galaxy   Galaxy project wrapper
               database Information on installed card database
rgi: error: too few arguments
root@aa6de58fc93e:/app# rgi main
usage: rgi main [-h] -i INPUT_SEQUENCE -o OUTPUT_FILE
                [-t {read,contig,protein,wgs}] [-a {DIAMOND,BLAST}]
                [-n THREADS] [--include_loose] [--local] [--clean] [--debug]
                [--low_quality] [-d {wgs,plasmid,chromosome,NA}] [-v]
rgi main: error: argument -i/--input_sequence is required
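
Based only on the `rgi main` usage text above, the argument list for the new interface could be assembled along these lines. This is a sketch under assumptions: `build_rgi_main_args` is a hypothetical helper, only flags that appear in the usage output are used, and optional flags are omitted unless set so that RGI's own defaults (which the usage text does not state) apply.

```python
INPUT_TYPES = ("read", "contig", "protein", "wgs")  # choices listed for -t
ALIGNERS = ("DIAMOND", "BLAST")                     # choices listed for -a


def build_rgi_main_args(input_sequence, output_file,
                        input_type=None, aligner=None,
                        threads=None, clean=False):
    """Build the argv list for `rgi main` (hypothetical helper)."""
    args = ["rgi", "main", "-i", input_sequence, "-o", output_file]
    if input_type is not None:
        if input_type not in INPUT_TYPES:
            raise ValueError("unsupported input type: %r" % (input_type,))
        args += ["-t", input_type]
    if aligner is not None:
        if aligner not in ALIGNERS:
            raise ValueError("unsupported aligner: %r" % (aligner,))
        args += ["-a", aligner]
    if threads is not None:
        args += ["-n", str(threads)]
    if clean:
        args.append("--clean")
    return args
```

The resulting list can be handed to `subprocess.call` in place of the old argument list that ended in `'-o', outputname`.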

kevinkle commented 6 years ago

Removing the overlapping rgi deps from the webserver env, since the `conda update...` command will overwrite the new versions with older ones.

kevin@kevin-ThinkPad-T510:~$ conda install -c bioconda rgi=4.0.3
Fetching package metadata .................
Solving package specifications: .

Package plan for installation in environment /home/kevin/miniconda2:

The following NEW packages will be INSTALLED:

    biopython:      1.71-py27_0           conda-forge
    diamond:        0.9.21-1              bioconda
    filetype:       1.0.1-py_0            conda-forge
    intel-openmp:   2018.0.3-0
    libgfortran-ng: 7.2.0-hdf63c60_3
    libopenblas:    0.2.20-h9ac9557_7
    mkl:            2018.0.3-1
    mkl_fft:        1.0.2-py27_0          conda-forge
    mkl_random:     1.0.1-py27_0          conda-forge
    numpy:          1.14.3-py27hcd700cb_2
    numpy-base:     1.14.3-py27h0ea5e3f_1
    prodigal:       2.6.3-0               bioconda
    rgi:            4.0.3-py27_0          bioconda

The following packages will be UPDATED:

    conda:          4.3.34-py27_0         conda-forge --> 4.5.4-py27_0 conda-forge

The following packages will be SUPERSEDED by a higher-priority channel:

    conda-env:      2.6.0-h36134e3_1                  --> 2.6.0-0      conda-forge

kevinkle commented 6 years ago

            # Indices check.
            df = pd.read_table(pickled_amr_tsv)
            indices = df.index.values
>           assert set(AMR_INDICES).issubset(set(indices))
E           AssertionError: assert False
E            +  where False = <built-in method issubset of set object at 0x7fbceee796a8>(set([0, 1, 2, 3, 4, 5, ...]))
E            +    where <built-in method issubset of set object at 0x7fbceee796a8> = set(['Best_Hit_ARO', 'CUT_OFF', 'ORF_ID', 'ORIENTATION', 'START', 'STOP']).issubset
E            +      where set(['Best_Hit_ARO', 'CUT_OFF', 'ORF_ID', 'ORIENTATION', 'START', 'STOP']) = set(['ORF_ID', 'START', 'STOP', 'ORIENTATION', 'CUT_OFF', 'Best_Hit_ARO'])
E            +    and   set([0, 1, 2, 3, 4, 5, ...]) = set(array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,\n   ...50,\n       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67]))
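
The assertion output shows that `df.index.values` now holds integer row numbers (0 to 67), while `AMR_INDICES` holds string labels, so the subset check cannot pass. One possibility, which is an assumption and not confirmed anywhere in this thread, is that with the new RGI output these labels are column headers of the TSV. A minimal Python 3 sketch of checking the header row instead, using only the standard library (`missing_columns` is a hypothetical helper; the label list is copied from the assertion output):

```python
import csv
import io

# Labels taken from the assertion output above.
AMR_INDICES = ['ORF_ID', 'START', 'STOP', 'ORIENTATION', 'CUT_OFF', 'Best_Hit_ARO']


def missing_columns(tsv_file, required=AMR_INDICES):
    """Return the required labels absent from the TSV header row, sorted."""
    reader = csv.reader(tsv_file, delimiter='\t')
    header = next(reader, [])  # empty file -> empty header
    return sorted(set(required) - set(header))


if __name__ == "__main__":
    # Made-up two-line TSV, for illustration only (not real RGI output).
    sample = io.StringIO(
        "ORF_ID\tSTART\tSTOP\tORIENTATION\tCUT_OFF\tBest_Hit_ARO\n"
        "orf_1\t10\t250\t+\tStrict\texample\n"
    )
    print(missing_columns(sample))
```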

kevinkle commented 6 years ago

Looks like RGI now identifies fewer targets:

E                       AssertionError: assert 1878 == 2007
E                        +  where 1878 = length('/home/travis/build/superphy/spfy/app/tests/ecoli/GCA_001894495.1_ASM189449v1_genomic.fna_rgi.ttl')
E                        +  and   2007 = length('tests/refs/GCA_001894495.1_ASM189449v1_genomic.fna_rgi.ttl')

kevinkle commented 6 years ago

Update looks good as of https://github.com/superphy/spfy/commit/1a862010a85f0ef0a904fc0f1e3fc9c5fdd5132a. Updating tests and pinning env versions now...

kevinkle commented 6 years ago

Done in https://github.com/superphy/spfy/pull/305