KwanLab / Autometa

Autometa: Automated Extraction of Genomes from Shotgun Metagenomes
https://autometa.readthedocs.io

internet_is_connected() function #257

Closed paulineauffret closed 2 years ago

paulineauffret commented 2 years ago

Hello, I wanted to point out that the function internet_is_connected() (common/utilities.py, line 430) does not work with our HPC configuration: it always returns False because operations such as ping are not permitted by our institute's policy, even though internet access is enabled. I had to disable this function to make autometa-update-databases --update-ncbi work. What do you think about this situation? Am I the only one encountering this bug? Thank you very much. Pauline

Current Behavior

Steps to Reproduce

~> autometa-update-databases --update-ncbi
[03/31/2022 02:39:42 PM DEBUG] autometa.config.databases: MISSING: (ncbi,merged)
[03/31/2022 02:39:42 PM DEBUG] autometa.config.databases: MISSING: (ncbi,nr)
[03/31/2022 02:39:42 PM DEBUG] autometa.config.databases: MISSING: (ncbi,names)
[03/31/2022 02:39:42 PM DEBUG] autometa.config.databases: MISSING: (ncbi,accession2taxid)
[03/31/2022 02:39:42 PM DEBUG] autometa.config.databases: MISSING: (ncbi,nodes)
[03/31/2022 02:39:42 PM DEBUG] autometa.config.databases: UPDATE: (ncbi,nr): /autometa/lib/python3.9/site-packages/autometa/databases/ncbi/nr.gz
[03/31/2022 02:39:42 PM DEBUG] autometa.config.databases: starting nr download
[03/31/2022 04:03:31 PM DEBUG] autometa.common.utilities: Wrote /autometa/lib/python3.9/site-packages/autometa/databases/ncbi/nr.gz checksum to /autometa/lib/python3.9/site-packages/autometa/databases/ncbi/nr.gz.md5
Traceback (most recent call last):
  File "/autometa/bin/autometa-update-databases", line 10, in <module>
    sys.exit(main())
  File "/autometa/lib/python3.9/site-packages/autometa/config/databases.py", line 787, in main
    config = dbs.configure(section=section, no_checksum=args.no_checksum)
  File "/autometa/lib/python3.9/site-packages/autometa/config/databases.py", line 682, in configure
    self.download_missing(section=section)
  File "/autometa/lib/python3.9/site-packages/autometa/config/databases.py", line 550, in download_missing
    dispatcher[section](options)
  File "/autometa/lib/python3.9/site-packages/autometa/config/databases.py", line 396, in download_ncbi_files
    remote_checksum = self.get_remote_checksum("ncbi", option)
  File "/autometa/lib/python3.9/site-packages/autometa/config/databases.py", line 203, in get_remote_checksum
    raise ConnectionError("Cannot connect to the internet")
ConnectionError: Cannot connect to the internet

jason-c-kwan commented 2 years ago

Thank you for alerting us to this. Our own HPC environment is not configured like this, so we didn't realize that some sites disable ping. After some brainstorming, @ajlail98 suggested confirming internet access by downloading a small file from NCBI. From your experience with your environment, would that work?
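For illustration, the idea of probing connectivity with an ordinary HTTP(S) request instead of ping could be sketched roughly as follows. This is not the code from Autometa: the probe URL, the timeout, and the `opener` parameter are assumptions made for this example, and the function simply reuses the name from the issue title.

```python
import urllib.request


def internet_is_connected(
    url="https://ftp.ncbi.nlm.nih.gov/",  # assumed probe URL, not taken from Autometa
    timeout=5.0,                          # assumed timeout in seconds
    opener=urllib.request.urlopen,        # hypothetical hook so the check can be tested offline
):
    """Return True if an HTTP(S) request to `url` succeeds, False otherwise.

    Unlike ping (ICMP), this uses the same kind of traffic as the database
    download itself, so it should pass wherever the download is allowed.
    """
    try:
        with opener(url, timeout=timeout) as response:
            response.read(1)  # read a single byte; the content is irrelevant
        return True
    except OSError:
        # urllib.error.URLError, HTTPError and socket timeouts all derive from OSError
        return False


if __name__ == "__main__":
    print("connected" if internet_is_connected() else "not connected")
```

In this sketch an HTTP error response (for example a 404) also counts as "not connected", which is stricter than necessary; a real implementation might choose to treat any server response as proof of connectivity.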

paulineauffret commented 2 years ago

Thank you very much for answering my issue. Yes, that would be great, and I think it would work. Would you use wget?

evanroyrees commented 2 years ago

Hello @paulineauffret, @ajlail98 has fixed this issue (PR #258) and we have committed these changes in the most recent release (https://github.com/KwanLab/Autometa/releases/tag/2.0.3).

Bioconda is usually a bit delayed when bumping versions, so conda install -c bioconda autometa will (at the moment) install version 2.0.2. You can check back to the bioconda autometa page to see when it is updated to version 2.0.3.
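To see whether an environment already has the fixed release, one option is to compare the installed version against 2.0.3. The following is a minimal sketch, assuming the package is installed under the distribution name `autometa` and uses plain numeric `major.minor.patch` version strings; the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def has_fix(installed, fixed="2.0.3"):
    """Return True if `installed` is at least the release that contains the fix.

    Assumes plain numeric versions such as "2.0.2" (no rc/dev suffixes).
    """
    def parse(v):
        return tuple(int(part) for part in v.split(".")[:3])

    return parse(installed) >= parse(fixed)


def installed_autometa_version(dist_name="autometa"):
    """Return the installed version string, or None if the package is absent.

    `dist_name` is an assumption about how the package is registered.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_autometa_version()
    if found is None:
        print("autometa is not installed in this environment")
    else:
        print(f"autometa {found}: fix included = {has_fix(found)}")
```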

If you are using Autometa with the nextflow workflow, then this should not be an issue as the docker image for version 2.0.3 is already hosted and available to be pulled.

Please do not hesitate to let us know of any other issues you encounter! Thanks for reaching out!

paulineauffret commented 2 years ago

Hello, thank you very much for addressing this issue so quickly! I'll try this in the coming days and let you know. Thanks again and best wishes!