sandialabs / TIGER

Target / Integrative Genetic Element Retriever: precisely maps IGEs (a defined type of genomic island) in bacterial and archaeal genomes; package also includes orthogonal program Islander

islander.pl -> cannot open file `rfind.gff' #3

Open lxsteiner opened 4 years ago

lxsteiner commented 4 years ago

I'm getting several errors when running the first step (islander.pl) on the test data, and I'm not sure whether that's normal, because the log also contains a lot of "completed successfully" messages.

Notable ones:

awk: fatal: cannot open file `rfind.gff' for reading (No such file or directory)
Error: Unable to open file rfam.gff. Exiting.
Error: Unable to open file tmrna.gff. Exiting.
Error: Unable to open file trna.gff. Exiting.
Error: Unable to open file trna.gff. Exiting.
Command 'prokka --rfam --prefix protein --locustag genome --gcode 11 --kingdom Bacteria --cpus 1 --rnammer --notrna --outdir ./ --force --quiet --locustag Eco837 ../genome.fa' failed with error message 512
***** ERROR: Requested column 4, but database file stdin only has fields 1 - 0.

Please see the full islander.log in the attachment. It finishes with:

# CPU time: 10.20u 0.38s 00:00:10.58 Elapsed: 00:00:10.58
//
[ok]
Command 'cmscan -o /dev/null --cpu 0 --tblout tmrna.tbl --oskip --fmt 2 /home/leon/tools/TIGER/db/cm/tmrna.cm /home/leon/tools/TIGER/testdata/genome.fa' succeeded

So I'm not sure whether this is OK or not.

I installed all dependencies with conda to match your exact versions:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
aragorn                   1.2.38               h516909a_3    bioconda
barrnap                   0.9                           3    bioconda
bedtools                  2.27.1               he513fc3_4    bioconda
blast                     2.6.0               boost1.64_2    bioconda
boost                     1.64.0                   py36_4    conda-forge
boost-cpp                 1.64.0                        1    conda-forge
bzip2                     1.0.8                h516909a_3    conda-forge
ca-certificates           2020.6.20            hecda079_0    conda-forge
certifi                   2020.6.20        py36h9f0ad1d_0    conda-forge
expat                     2.2.9                he1b5a44_2    conda-forge
hmmer                     3.3                  he1b5a44_1    bioconda
icu                       58.2              hf484d3e_1000    conda-forge
infernal                  1.1.2                h516909a_3    bioconda
ld_impl_linux-64          2.34                 hc38a660_9    conda-forge
libblas                   3.8.0               17_openblas    conda-forge
libcblas                  3.8.0               17_openblas    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc                    7.2.0                h69d50b8_2    conda-forge
libgcc-ng                 9.3.0               h24d8f2e_16    conda-forge
libgfortran-ng            7.5.0               hdf63c60_16    conda-forge
libgomp                   9.3.0               h24d8f2e_16    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
libidn11                  1.34                 h1cef754_0    conda-forge
liblapack                 3.8.0               17_openblas    conda-forge
libopenblas               0.3.10          pthreads_hb3c22a3_4    conda-forge
libstdcxx-ng              9.3.0               hdf63c60_16    conda-forge
minced                    0.4.2                         0    bioconda
ncurses                   6.2                  he1b5a44_1    conda-forge
numpy                     1.19.1           py36h3849536_2    conda-forge
openjdk                   11.0.1            h516909a_1016    conda-forge
openssl                   1.1.1g               h516909a_1    conda-forge
parallel                  20160622                      1    bioconda
perl                      5.26.2            h516909a_1006    conda-forge
perl-app-cpanminus        1.7044                  pl526_1    bioconda
perl-bioperl              1.6.924                       4    bioconda
perl-carp                 1.38                    pl526_3    bioconda
perl-constant             1.33                    pl526_1    bioconda
perl-exporter             5.72                    pl526_1    bioconda
perl-extutils-makemaker   7.36                    pl526_1    bioconda
perl-file-path            2.16                    pl526_0    bioconda
perl-file-temp            0.2304                  pl526_2    bioconda
perl-ipc-run3             0.048                   pl526_0    bioconda
perl-parent               0.236                   pl526_1    bioconda
perl-threaded             5.26.0                        0    bioconda
perl-time-hires           1.9760          pl526h14c3975_1    bioconda
perl-xml-namespacesupport 1.12                    pl526_0    bioconda
perl-xml-parser           2.44            pl526h4e0c4b3_7    bioconda
perl-xml-sax              1.02                    pl526_0    bioconda
perl-xml-sax-base         1.09                    pl526_0    bioconda
perl-xml-sax-expat        0.51                    pl526_3    bioconda
perl-xml-simple           2.25                    pl526_1    bioconda
perl-xsloader             0.24                    pl526_0    bioconda
perl-yaml                 1.29                    pl526_0    bioconda
pip                       20.2.2                     py_0    conda-forge
prodigal                  2.6.3                h516909a_2    bioconda
prokka                    1.11                          0    bioconda
python                    3.6.11          h4d41432_2_cpython    conda-forge
python_abi                3.6                     1_cp36m    conda-forge
readline                  8.0                  he28a2e2_2    conda-forge
setuptools                49.6.0           py36h9f0ad1d_0    conda-forge
sqlite                    3.33.0               h4cf870e_0    conda-forge
tbl2asn                   25.7                          0    bioconda
tk                        8.6.10               hed695b0_0    conda-forge
trnascan-se               2.0.3           pl526h14c3975_0    bioconda
wheel                     0.35.1             pyh9f0ad1d_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h516909a_1007    conda-forge

The only difference is tRNAscan-SE, which can only be 2.0-1 or 2.0.3 because 2.0.2 is not available.

Thanks, Leon

kpwilliams commented 4 years ago

test.txt

Attached is what directories should look like after a successful run:

kpwilliams commented 4 years ago

Those error messages are bad: they mean that little or none of the tRNA/tmRNA-finding is working, so Islander would be unable to produce any results. It might help the troubleshooting if you post a directory-structure file similar to the one I posted above. (Going even deeper, into the subdirectories of the first "trna", would be better.)
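A listing like that can be produced in several ways. The following is only a sketch, not a script that ships with TIGER: the function name `list_run_dir` is made up, and it prints each file's size in bytes next to its path so that missing or zero-byte outputs stand out.

```shell
# Hypothetical helper (not part of TIGER): print "size<TAB>path" for every
# file under a run directory, sorted by path.
list_run_dir() {
  local dir="${1:-.}"
  (
    cd "$dir" || exit 1
    find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      # wc -c prints the byte count; tr strips padding some platforms add
      printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
    done
  )
}

# Example (run from the test data directory):
#   list_run_dir . > my_listing.txt
```

A file that shows a size of 0, or that appears in the reference listing but not in this one, points to the step that failed.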

lxsteiner commented 4 years ago

Thanks for providing an output example.

I've still been fighting to get all the dependencies to work. Here are the latest issues I hit with islander.pl:

Error 1

Possible attempt to separate words with commas at /home/leon/tools/TIGER/bin/rfind.pl line 22.

which does seem to be an issue in the script because line 22 in rfind.pl looks like this:

     20 #my @blastCmd = (qw/blastn -soft_masking false -lcase_masking -db/, $dbfiles, qw/-outfmt 6/);  # Run3 wants an array of its command words!
     21 my @blastCmd = (qw/blastn -dust no -soft_masking false -lcase_masking -db/, $dbfiles, qw/-outfmt 6/);  # Run3 wants an array of its command words!
     22 my @mergeCmd = qw/bedtools merge -s -c 4,6 -o collapse -i stdin/;
     23 #warn scalar(keys %inseqs), " inseqs, first of size ", length($inseqs{(keys %inseqs)[0]}), "; db=$dbfiles\n";

I think there are () missing for the array on line 22, no? Haven't used Perl in a long time.

Error 2

FATAL: Unable to find /usr/local/bin/cmsearch executable

Something is specifically calling a global installation of cmsearch. I thought it might have been hardcoded somewhere by accident, but the only TIGER script in which I can find cmsearch is introns.pl, in the following lines:

    #my $cmd = "cmsearch --cpu 0 --tblout gpi.tbl $lib/cm/Intron_gpI.cm $file &> /dev/null";
    my $cmd = "cmsearch --cpu 0 --tblout gpi.tbl $lib/cm/gpi_bact_trna.cm $file &> /dev/null";
    RunCommand($cmd, 'gpi.tbl');

and I don't see why it would explicitly call /usr/local/bin/cmsearch, because my binaries are exported on the PATH, like everything else, and run fine:

$ which cmsearch
/home/leon/tools/infernal-1.1.3-linux-intel-gcc/binaries/cmsearch

Is introns.pl even called from islander.pl? If not, some other piece of software might be doing it...
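One way to narrow this down is to search the text files of the installed tools for the literal path, since some programs read executable locations from their own configuration file instead of the PATH. That may be the case for tRNAscan-SE and its tRNAscan-SE.conf, but this is an assumption that has not been verified on this setup. A minimal sketch, with a made-up function name and example directories taken from the paths mentioned in this thread:

```shell
# Hypothetical helper: list text files under the given directories that
# contain a fixed string such as a hard-coded install path.
find_hardcoded_path() {
  local needle="$1"
  shift
  # -r recurse, -I skip binary files, -l print file names only, -F fixed string
  grep -rIlF -- "$needle" "$@" 2>/dev/null | LC_ALL=C sort
}

# Example (directories are assumptions, adjust to the local setup):
#   find_hardcoded_path /usr/local/bin \
#       /home/leon/tools/TIGER /home/leon/miniconda3/envs/PROKKATIGER
```

Any configuration file or script that turns up is a candidate for the component that builds the /usr/local/bin/cmsearch path.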

Error 3

I'm also getting some errors with Prokka (1.13) which I don't think are related to TIGER:

Argument "1.7.7" isn't numeric in numeric lt (<) at /home/beate/anaconda/bin/prokka line 253.
Use of uninitialized value in concatenation (.) or string at /home/beate/anaconda/bin/prokka line 201.
Use of uninitialized value in numeric lt (<) at /home/beate/anaconda/bin/prokka line 202.
[12:05:35] Prokka needs signalp 3.0 or higher. Please upgrade and try again.

It's probably some BioPerl conundrum... Also, it asks for signalp, but signalp is there:

$ signalp -version
SignalP version 5.0b Linux x86_64

I think I'll try upgrading Prokka to the newest version and hopefully some of these errors will be solved then. But please do comment on the 1st and 2nd error I mentioned here.

Thanks!

lxsteiner commented 4 years ago

Interesting turn of events. After upgrading Prokka, the Prokka issues are gone and I'm getting a bit more output from islander.pl, but nowhere near the entire directory structure from your example output, and some files are still empty.

testdata_islander.txt

The issue with cmsearch still persists, even though I have cmsearch both in a conda environment and installed locally outside of it:

FATAL: Unable to find /usr/local/bin/cmsearch executable

which is funny because there is even a global cmsearch installation, albeit an older version:

$ /usr/bin/cmsearch -h
# cmsearch :: search CM(s) against a sequence database
# INFERNAL 1.1rc4 (June 2013)

but the correct versions are all exported to the path before the global one:

$ which cmsearch
/home/leon/tools/infernal/binaries/cmsearch

or in conda:

$ which cmsearch
/home/leon/miniconda3/envs/PROKKATIGER/bin/cmsearch
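Since `which` reports only the first match, it can help to list every copy on the PATH in lookup order. A sketch with a made-up function name (`type -a cmsearch` in bash gives similar information):

```shell
# Hypothetical helper: print every executable with the given name found on
# the PATH, in the order the shell would search.
all_on_path() {
  local name="$1" dir
  local IFS=:
  for dir in $PATH; do
    # keep regular executables only, skip directories with the same name
    [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ] && printf '%s\n' "$dir/$name"
  done
  return 0
}

all_on_path cmsearch
```

PATH order has no effect on a caller that asks for an absolute path such as /usr/local/bin/cmsearch, so the error can persist however the PATH is arranged.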

Also, this time the islander.log was 36 MB in size because it includes actual output from cmsearch and hmmsearch.

Here is the entire output if you want to check it out: islander.zip

I'm not sure what to try and change anymore :/