weisberglab / beav

BEAV: Bacterial Element Annotation reVamped
GNU General Public License v3.0

AntiSMASH & GapMind installations give an error? #5

Open Anto007 opened 3 months ago

Anto007 commented 3 months ago

Hi,

Beav looks interesting and I was hoping to test it out. I installed beav via conda on my Ubuntu 20.04.6 LTS workstation, but I get the error below after activating the conda beav env and running beav_db:

ANTISMASH
(This may take a while, estimated 9GB download)
Traceback (most recent call last):
  File "/data/conda_beav_env/bin/download-antismash-databases", line 6, in <module>
    from antismash.download_databases import _main
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/__init__.py", line 12, in <module>
    from antismash.main import run_antismash, get_detection_modules, \
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/main.py", line 40, in <module>
    from antismash.outputs import html, svg
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/outputs/html/__init__.py", line 25, in <module>
    from antismash.outputs.html.generator import generate_webpage, find_local_antismash_js_path
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/outputs/html/generator.py", line 45, in <module>
    VISUALISERS = _get_visualisers()
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/outputs/html/generator.py", line 39, in _get_visualisers
    module = importlib.import_module(f"antismash.outputs.html.visualisers.{module_data.name}")
  File "/data/conda_beav_env/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/outputs/html/visualisers/bubble_view.py", line 39, in <module>
    from antismash.modules import nrps_pks
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/modules/nrps_pks/__init__.py", line 15, in <module>
    from .html_output import generate_html, will_handle
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/modules/nrps_pks/html_output.py", line 15, in <module>
    from .results import NRPS_PKS_Results, CandidateClusterPrediction, UNKNOWN
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/modules/nrps_pks/results.py", line 18, in <module>
    from .nrpys import PredictorSVMResult
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/modules/nrps_pks/nrpys.py", line 13, in <module>
    import nrpys
ModuleNotFoundError: No module named 'nrpys'

GAPMIND
Can't locate DBI.pm in @INC (you may need to install the DBI module) (@INC contains: /home/antonycp/tools/miniconda3/pkgs/perl-bioperl-core-1.007002-pl5262hdfd78af_3/lib/site_perl/5.26.2 /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.30.0 /usr/local/share/perl/5.30.0 /usr/lib/x86_64-linux-gnu/perl5/5.30 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.30 /usr/share/perl/5.30 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
BEGIN failed--compilation aborted at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
Can't locate DBI.pm in @INC (you may need to install the DBI module) (@INC contains: /home/antonycp/tools/miniconda3/pkgs/perl-bioperl-core-1.007002-pl5262hdfd78af_3/lib/site_perl/5.26.2 /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.30.0 /usr/local/share/perl/5.30.0 /usr/lib/x86_64-linux-gnu/perl5/5.30 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.30 /usr/share/perl/5.30 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
BEGIN failed--compilation aborted at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
diamond v2.1.8.162 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 104
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database input file: /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/tmp/path.aa/curated.faa
Opening the database file... Error: Error detecting input file format. First line must begin with '>' (FASTA) or '@' (FASTQ).
diamond v2.1.8.162 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 104
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database input file: /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/tmp/path.carbon/curated.faa
Opening the database file... Error: Error detecting input file format. First line must begin with '>' (FASTA) or '@' (FASTQ).

DONE

acarafat commented 3 months ago

Hello @Anto007 ,

Thanks for reporting the issue. It seems the nrpys module is not found. What happens if you activate the beav conda environment, install nrpys manually, and then run beav_db?

Please give it a try and let us know if the issue is still there.

Anto007 commented 3 months ago

Thanks @acarafat for getting back on this. I had tried this too, but the same error persists after running beav_db. After activating the beav conda environment, I ran the following:

python -m pip install nrpys
Requirement already satisfied: nrpys in /data/conda_beav_env/lib/python3.9/site-packages (0.1.1)
beav_db --skip_bakta_db

ANTISMASH
(This may take a while, estimated 9GB download)
Traceback (most recent call last):
  File "/data/conda_beav_env/bin/download-antismash-databases", line 6, in <module>
    from antismash.download_databases import _main
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/__init__.py", line 12, in <module>
    from antismash.main import run_antismash, get_detection_modules, \
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/main.py", line 40, in <module>
    from antismash.outputs import html, svg
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/outputs/html/__init__.py", line 25, in <module>
    from antismash.outputs.html.generator import generate_webpage, find_local_antismash_js_path
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/outputs/html/generator.py", line 45, in <module>
    VISUALISERS = _get_visualisers()
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/outputs/html/generator.py", line 39, in _get_visualisers
    module = importlib.import_module(f"antismash.outputs.html.visualisers.{module_data.name}")
  File "/data/conda_beav_env/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/outputs/html/visualisers/bubble_view.py", line 39, in <module>
    from antismash.modules import nrps_pks
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/modules/nrps_pks/__init__.py", line 15, in <module>
    from .html_output import generate_html, will_handle
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/modules/nrps_pks/html_output.py", line 15, in <module>
    from .results import NRPS_PKS_Results, CandidateClusterPrediction, UNKNOWN
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/modules/nrps_pks/results.py", line 18, in <module>
    from .nrpys import PredictorSVMResult
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/modules/nrps_pks/nrpys.py", line 13, in <module>
    import nrpys
ModuleNotFoundError: No module named 'nrpys'

acarafat commented 3 months ago

Hi @Anto007,

This looks similar to the recent nrpys issue mentioned here: https://github.com/bioconda/bioconda-recipes/issues/44377#issuecomment-1825111250

A possible solution is to use Conda to build nrpys 0.1.1 for Python 3.9: https://github.com/bioconda/bioconda-recipes/issues/44377#issuecomment-1824735638

alexweisberg commented 3 months ago

Hi @Anto007,

Arafat is correct, it seems to be an issue with the latest nrpys conda build recipe.

An easier alternative may be to try again with pip in the beav environment, but this time using the force reinstall option:

python -m pip install --upgrade --force-reinstall nrpys
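
As the earlier output shows, pip can report "Requirement already satisfied" while the import still fails, so it may help to test the import directly after reinstalling. The following is a minimal sketch, not part of beav or antiSMASH; `check_py_module` is a hypothetical helper name, and it assumes the interpreter you pass (default `python`) is the one from the activated beav environment.

```shell
# Hypothetical helper (not part of beav, antiSMASH, or nrpys).
# Usage: check_py_module MODULE [INTERPRETER]
# Prints the file the module was loaded from; returns non-zero if the import fails.
check_py_module() {
    module="$1"
    interpreter="${2:-python}"
    "$interpreter" -c '
import importlib, sys
mod = importlib.import_module(sys.argv[1])
print(getattr(mod, "__file__", "(built-in)"))
' "$module"
}

# Example, run inside the activated beav environment:
#   check_py_module nrpys
```

If the printed path is outside the environment's site-packages directory, or the call fails with the same ModuleNotFoundError, the reinstall has probably not taken effect for that interpreter.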

Anto007 commented 3 months ago

Thank you very much @alexweisberg and @acarafat for your responses. The force reinstall of nrpys got AntiSMASH installed, but the GapMind installation still fails. I have the perl-DBI module in my conda env (running conda list perl-dbi in my conda beav env returns perl-dbi;1.643;pl5321h166bdaf_0;conda-forge), so the error message below is strange. Any ideas?

GAPMIND
Can't locate DBI.pm in @INC (you may need to install the DBI module) (@INC contains: /home/antonycp/tools/miniconda3/pkgs/perl-bioperl-core-1.007002-pl5262hdfd78af_3/lib/site_perl/5.26.2 /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.30.0 /usr/local/share/perl/5.30.0 /usr/lib/x86_64-linux-gnu/perl5/5.30 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.30 /usr/share/perl/5.30 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
BEGIN failed--compilation aborted at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
Can't locate DBI.pm in @INC (you may need to install the DBI module) (@INC contains: /home/antonycp/tools/miniconda3/pkgs/perl-bioperl-core-1.007002-pl5262hdfd78af_3/lib/site_perl/5.26.2 /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.30.0 /usr/local/share/perl/5.30.0 /usr/lib/x86_64-linux-gnu/perl5/5.30 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.30 /usr/share/perl/5.30 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
BEGIN failed--compilation aborted at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.

alexweisberg commented 3 months ago

Hmm, could you paste the output of the following command both before and after activating your beav environment?

echo $PERL5LIB

Sometimes having other perl modules installed elsewhere can conflict with the conda versions. If you unset PERL5LIB before activating your beav environment, that may help.

Anto007 commented 3 months ago

Thank you for your feedback, @alexweisberg. I think this problem is due to perl library conflicts: $PERL5LIB points to a perl version outside of my conda beav env. I tried exporting PERL5LIB to point to the perl version within the conda beav env, but the GapMind installation still fails. I don't want to run unset PERL5LIB without knowing whether it would have permanent consequences. There are a hundred other tools installed on my system and I don't wish to break those merely to make beav work. One of the main reasons I was interested in beav is that it also provides GapMind analysis, but unfortunately it seems beav will just end up on my 'interesting but unusable tools' list. The standalone GapMind is another tool that I don't feel very motivated to install, since a convenient web server is already available for it.

alexweisberg commented 3 months ago

If you run unset PERL5LIB from the command line prompt, it will only take effect for that login session. Alternatively, if you put your conda activation and beav commands in a bash script, you can put the unset command at the top of that bash script and it will only take effect while that script is running. The only way to make the unset permanent is to put it in your .bashrc or .profile files.
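
The scoping described above can be sketched as a small shell function. This is an illustration under assumptions, not something shipped with beav: `run_without_perl5lib` is a hypothetical name, and it relies only on the fact that a subshell gets its own copy of the environment.

```shell
# Hypothetical helper: run a command with PERL5LIB removed from its environment.
# The parentheses start a subshell, so the unset never reaches the calling shell.
run_without_perl5lib() {
    (
        unset PERL5LIB
        "$@"
    )
}

# Example (commented out; assumes the beav environment is already activated):
#   run_without_perl5lib beav_db --skip_bakta_db
```

After the function returns, `echo $PERL5LIB` in the calling shell still prints the original value, so other tools that depend on it are unaffected. Note that this sketch unsets the variable after activation rather than before it; whether that ordering matters for this environment is not established in the thread.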

Anto007 commented 3 months ago

Thank you very much again @alexweisberg for your patient responses. I ran unset PERL5LIB outside the conda environment and then activated my conda beav env. Nothing is displayed when I run echo $PERL5LIB, either before or after activating the conda beav env. The GapMind installation still failed with the same error message: Can't locate DBI.pm in @INC

I removed the conda beav env and retried the above steps in a fresh conda beav env, and the GapMind installation error persisted. Strangely, the antiSMASH installation also failed this time, despite working in the previous conda beav env after I ran your recommended command, python -m pip install --upgrade --force-reinstall nrpys (I ran this in the fresh conda installation too). Please find the complete error log below. Thanks again.

beav_db --skip_bakta_db
MACSYFINDER
Downloading TXSScan (1.1.3).
Extracting TXSScan (1.1.3).
Installing TXSScan (1.1.3) in /data/conda_beav_env/share/beav-1.3.0/models
Cleaning.
The models TXSScan (1.1.3) have been installed successfully.

DEFENSE FINDER
Downloading defense-finder-models (1.3.0).
Extracting defense-finder-models (1.3.0).
Installing defense-finder-models (1.3.0) in /data/conda_beav_env/share/beav-1.3.0/models
Cleaning.
The models defense-finder-models (1.3.0) have been installed successfully.
Downloading CasFinder (3.1.0).
Downloading CasFinder (3.1.0).
Extracting CasFinder (3.1.0).
Extracting CasFinder (3.1.0).
Installing CasFinder (3.1.0) in /data/conda_beav_env/share/beav-1.3.0/models
Installing CasFinder (3.1.0) in /data/conda_beav_env/share/beav-1.3.0/models
Cleaning.
Cleaning.
The models CasFinder (3.1.0) have been installed successfully.
The models CasFinder (3.1.0) have been installed successfully.

ANTISMASH
(This may take a while, estimated 9GB download)
Downloading antismash.js: 100.00% downloaded.
Downloading PFAM version 35.0
Traceback (most recent call last):
  File "/data/conda_beav_env/lib/python3.9/urllib/request.py", line 1346, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/data/conda_beav_env/lib/python3.9/http/client.py", line 1285, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/data/conda_beav_env/lib/python3.9/http/client.py", line 1331, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/data/conda_beav_env/lib/python3.9/http/client.py", line 1280, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/data/conda_beav_env/lib/python3.9/http/client.py", line 1040, in _send_output
    self.send(msg)
  File "/data/conda_beav_env/lib/python3.9/http/client.py", line 980, in send
    self.connect()
  File "/data/conda_beav_env/lib/python3.9/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/data/conda_beav_env/lib/python3.9/ssl.py", line 501, in wrap_socket
    return self.sslsocket_class._create(
  File "/data/conda_beav_env/lib/python3.9/ssl.py", line 1074, in _create
    self.do_handshake()
  File "/data/conda_beav_env/lib/python3.9/ssl.py", line 1343, in do_handshake
    self._sslobj.do_handshake()
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/download_databases.py", line 125, in download_file
    req = request.urlopen(url)  # pylint: disable=consider-using-with
  File "/data/conda_beav_env/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/data/conda_beav_env/lib/python3.9/urllib/request.py", line 517, in open
    response = self._open(req, data)
  File "/data/conda_beav_env/lib/python3.9/urllib/request.py", line 534, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/data/conda_beav_env/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/data/conda_beav_env/lib/python3.9/urllib/request.py", line 1389, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/data/conda_beav_env/lib/python3.9/urllib/request.py", line 1349, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 104] Connection reset by peer>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/conda_beav_env/bin/download-antismash-databases", line 10, in <module>
    sys.exit(_main())
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/download_databases.py", line 521, in _main
    if not download(args):
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/download_databases.py", line 474, in download
    download_pfam(
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/download_databases.py", line 254, in download_pfam
    download_if_not_present(url, archive_filename, archive_checksum)
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/download_databases.py", line 226, in download_if_not_present
    download_file(url, filename)
  File "/data/conda_beav_env/lib/python3.9/site-packages/antismash/download_databases.py", line 127, in download_file
    raise DownloadError("ERROR: File not found on server.\nPlease check your internet connection.")
antismash.download_databases.DownloadError: ERROR: File not found on server.
Please check your internet connection.

TIGER2
requirement Pfam-A.hmm file does not exist
Downloading and extracting Pfam-A hmms
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0
curl: (35) Recv failure: Connection reset by peer

gzip: /data/conda_beav_env/share/beav-1.3.0/software/TIGER/db/Pfam-A.hmm.gz: unexpected end of file

DBSCAN-SWA
Downloading and extracting DBSCAN-SWA database
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  687M  100  687M    0     0  31.5M      0  0:00:21  0:00:21 --:--:-- 33.2M
db/
db/database/
db/database/phage_protein_db.dmnd
db/database/phage_nucl/
db/database/phage_nucl/phage_nucl_db.nhr
db/database/phage_nucl/phage_nucl_db.nin
db/database/phage_nucl/phage_nucl_db.nsq
db/database/uniprot.dmnd
db/profiles/
db/profiles/phage_inf_dict.npy
db/profiles/uniprot_species.txt
db/profiles/phage_inf_dict.txt

GAPMIND
Can't locate DBI.pm in @INC (you may need to install the DBI module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.30.0 /usr/local/share/perl/5.30.0 /usr/lib/x86_64-linux-gnu/perl5/5.30 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.30 /usr/share/perl/5.30 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
BEGIN failed--compilation aborted at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
Can't locate DBI.pm in @INC (you may need to install the DBI module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.30.0 /usr/local/share/perl/5.30.0 /usr/lib/x86_64-linux-gnu/perl5/5.30 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.30 /usr/share/perl/5.30 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
BEGIN failed--compilation aborted at /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/bin/extractHmms.pl line 3.
diamond v2.1.8.162 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 104
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database input file: /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/tmp/path.aa/curated.faa
Opening the database file...  [0.004s]
Loading sequences...  [0.28s]
Masking sequences...  [0.122s]
Writing sequences...  [0.065s]
Hashing sequences...  [0.022s]
Loading sequences...  [0s]
Writing trailer...  [0.003s]
Closing the input file...  [0s]
Closing the database file...  [0s]

Database sequences  142820
  Database letters  70722450
     Database hash  f0d53721b1b39cfb7472640f7d3a5970
        Total time  0.499000s
diamond v2.1.8.162 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 104
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database input file: /data/conda_beav_env/share/beav-1.3.0/software/PaperBLAST/tmp/path.carbon/curated.faa
Opening the database file...  [0.004s]
Loading sequences...  [0.235s]
Masking sequences...  [0.114s]
Writing sequences...  [0.057s]
Hashing sequences...  [0.017s]
Loading sequences...  [0s]
Writing trailer...  [0.003s]
Closing the input file...  [0s]
Closing the database file...  [0s]

Database sequences  127026
  Database letters  63083469
     Database hash  68294e65bf649f3c9f8b61b8475ac728
        Total time  0.432000s

DONE

Dont forget to set the BAKTA_DB environment variable to point to 
alternatively, provide --db  as an argument to bakta

alexweisberg commented 3 months ago

That is odd. Did you have the conda environment activated when you ran beav_db? The antismash error looks like an internet connection issue. Perhaps the server hosting the antismash db is temporarily down. You might want to try rerunning it later.
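
If rerunning by hand is inconvenient, a retry loop is one option. The sketch below is generic shell and not part of beav; `retry` is a hypothetical helper, and the attempt count and delay in the example are arbitrary. It only helps if the wrapped command exits non-zero on failure, and the thread does not show whether beav_db does so when a single download fails.

```shell
# Hypothetical helper: retry a command up to ATTEMPTS times, DELAY seconds apart.
# Usage: retry ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    while true; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example (commented out): three attempts, ten minutes apart.
#   retry 3 600 beav_db --skip_bakta_db
```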

The GapMind error is also strange. The perl library path (@INC) doesn't seem to include any of the folders from your conda environment, which makes me think it is not activated. If it is somehow using your system perl instead, which is a different version from the one in the beav environment, then it won't be able to use the DBI library from the conda environment.

Anto007 commented 3 months ago

Thank you @alexweisberg. I agree it's strange. The conda environment is activated though. If it were not activated, there's no way the command beav_db --skip_bakta_db would have worked (as is clear from my posted log) and conda is the only route I have followed for installing your tool. Hmm...I guess antismash db download will work if I perhaps retry later but I'll definitely need to think more on how best to fix the perl library path issue.

alexweisberg commented 3 months ago

A few other things you could try would be to activate the conda environment, and then try running

which perl

and

perl -e "print qq(@INC)"

Both of these should point to the perl files and paths in your conda environment. If not, something is odd with the setup of perl or the activation of conda on your system.
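
The two checks above can be combined into one small script. This is a sketch under assumptions: `path_in_prefix` is a hypothetical helper, and it relies on `CONDA_PREFIX`, which conda sets to the active environment's directory on activation. It also tries to load DBI with whichever perl is first on PATH.

```shell
# Hypothetical helper: succeed if PATH_TO_CHECK lies under PREFIX.
# Usage: path_in_prefix PATH_TO_CHECK PREFIX
path_in_prefix() {
    case "$1" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Report only when a conda environment is active and perl is on PATH.
if [ -n "${CONDA_PREFIX:-}" ] && command -v perl >/dev/null 2>&1; then
    perl_path=$(command -v perl)
    if path_in_prefix "$perl_path" "$CONDA_PREFIX"; then
        echo "perl comes from the active environment: $perl_path"
    else
        echo "perl comes from outside the active environment: $perl_path"
    fi
    # Loading DBI here mirrors what extractHmms.pl attempts on its line 3.
    perl -MDBI -e 'print "DBI $DBI::VERSION loads\n"' 2>/dev/null \
        || echo "DBI does not load with $perl_path"
fi
```

If perl resolves inside the environment and DBI loads, yet beav_db still reports the error, it may be worth looking at the first line of extractHmms.pl (for example with `head -n 1`), since a hard-coded interpreter path would bypass PATH. Whether that applies here is not confirmed in this thread.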