OLC-Bioinformatics / ConFindr

Intra-species bacterial contamination detection
https://olc-bioinformatics.github.io/ConFindr/
MIT License

Running ConFindr with weird internet access #36

Closed lskatz closed 1 year ago

lskatz commented 2 years ago

Hi, I have some cgMLST databases and have formatted one of them with confindr but it still tries to download other databases when it initializes. I am behind a funky firewall and so I want to try to run this without internet access. How can I tell it to just look at one database and not try to download others? Or is this sort of the wrong way to think about things? Thank you for any help.

  2022-05-24 18:34:11  Welcome to ConFindr 0.7.4! Beginning analysis of your samples...
  2022-05-24 18:34:11  Could not find Escherichia_db_cgderived.fasta
  2022-05-24 18:34:11  Could not find Listeria_db_cgderived.fasta
  2022-05-24 18:34:11  Could not find Salmonella_db_cgderived.fasta
  2022-05-24 18:34:11  Could not find refseq.msh
  2022-05-24 18:34:11  Databases not present - downloading basic databases now...
  2022-05-24 18:34:11  Downloading mash refseq sketch...
  2022-05-24 18:34:11  Downloading cgMLST-derived data for Escherichia, Salmonella, and Listeria...
Traceback (most recent call last):
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 1350, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1281, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1327, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1276, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1036, in _send_output
    self.send(msg)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 976, in send
    self.connect()
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1451, in connect
    server_hostname=server_hostname)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/ssl.py", line 423, in wrap_socket
    session=session
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/ssl.py", line 870, in _create
    self.do_handshake()
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/ssl.py", line 1139, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1091)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/bin/confindr.py", line 10, in <module>
    sys.exit(main())
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 1214, in main
    confindr(args)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 1031, in confindr
    check_for_databases_and_download(database_location=args.databases)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 930, in check_for_databases_and_download
    download_cgmlst_derived_data(database_location)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/site-packages/confindr_src/database_setup.py", line 235, in download_cgmlst_derived_data
    os.path.join(output_folder, 'confindr_db.tar.gz'))
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 543, in _open
    '_open', req)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 1393, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 1352, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1091)>
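
Judging only from the "Could not find ..." lines in the log above, ConFindr 0.7.4 looks for four named files before deciding to download. A small pre-flight check can report which of those files are absent from a database folder before a run is attempted on a machine with restricted internet access. This is a sketch, not ConFindr's own logic: the function name `missing_database_files` is hypothetical, and the file list is copied from the log rather than from the ConFindr source.

```python
from pathlib import Path

# File names copied from the "Could not find ..." log lines above.
# Assumption: these are the files ConFindr 0.7.4 expects; check the
# ConFindr documentation for the authoritative list.
EXPECTED_FILES = (
    "Escherichia_db_cgderived.fasta",
    "Listeria_db_cgderived.fasta",
    "Salmonella_db_cgderived.fasta",
    "refseq.msh",
)


def missing_database_files(database_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from database_dir."""
    folder = Path(database_dir)
    return [name for name in expected if not (folder / name).is_file()]
```

Calling `missing_database_files("senterica.confindr")` returns a list of the names that are not present in that folder; an empty list means all four files were found.
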

lskatz commented 2 years ago

I also tried 7-gene MLST just to get started, and it doesn't seem to work there either:

(confindr) [gzu2@monolith3 test_confindr]$ confindr_create_db -i senterica/ -o senterica.confindr -g Salmonella --desired_number_genes 7
# .... downloaded from NCBI okay ...
(confindr) [gzu2@monolith3 test_confindr]$ ls senterica
aroC.fasta      dnaN.fasta.nhr  hemD.fasta.nin  hisD.fasta.nsq  senterica.txt   thrA.fasta
aroC.fasta.nhr  dnaN.fasta.nin  hemD.fasta.nsq  purE.fasta      sucA.fasta      thrA.fasta.nhr
aroC.fasta.nin  dnaN.fasta.nsq  hisD.fasta      purE.fasta.nhr  sucA.fasta.nhr  thrA.fasta.nin
aroC.fasta.nsq  hemD.fasta      hisD.fasta.nhr  purE.fasta.nin  sucA.fasta.nin  thrA.fasta.nsq
dnaN.fasta      hemD.fasta.nhr  hisD.fasta.nin  purE.fasta.nsq  sucA.fasta.nsq
(confindr) [gzu2@monolith3 test_confindr]$ ls senterica.confindr -C | head
assembly_summary_refseq.txt  genome_33.fasta.nin   genome_670.fasta.nin
gene_hit_report.csv          genome_33.fasta.nsq   genome_670.fasta.nsq
genome_1000.fasta            genome_340.fasta      genome_671.fasta
genome_1000.fasta.nhr        genome_340.fasta.nhr  genome_671.fasta.nhr
genome_1000.fasta.nin        genome_340.fasta.nin  genome_671.fasta.nin
genome_1000.fasta.nsq        genome_340.fasta.nsq  genome_671.fasta.nsq
genome_1001.fasta            genome_341.fasta      genome_672.fasta
genome_1001.fasta.nhr        genome_341.fasta.nhr  genome_672.fasta.nhr
genome_1001.fasta.nin        genome_341.fasta.nin  genome_672.fasta.nin
genome_1001.fasta.nsq        genome_341.fasta.nsq  genome_672.fasta.nsq
(confindr) [gzu2@monolith3 test_confindr]$ confindr.py -i /scicomp/groups/OID/NCEZID/DFWED/EDLB/projects/validation/mlstComparison/illumina/weird-genomes.sneakernet -d senterica.confindr -o senterica.confindr.out --threads 12
  2022-05-24 18:40:10  Welcome to ConFindr 0.7.4! Beginning analysis of your samples...
  2022-05-24 18:40:10  Could not find Escherichia_db_cgderived.fasta
  2022-05-24 18:40:10  Could not find Listeria_db_cgderived.fasta
  2022-05-24 18:40:10  Could not find Salmonella_db_cgderived.fasta
  2022-05-24 18:40:10  Databases not present - downloading basic databases now...
  2022-05-24 18:40:10  Downloading mash refseq sketch...
  2022-05-24 18:40:11  Downloading cgMLST-derived data for Escherichia, Salmonella, and Listeria...
Traceback (most recent call last):
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 1350, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1281, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1327, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1276, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1036, in _send_output
    self.send(msg)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 976, in send
    self.connect()
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/http/client.py", line 1451, in connect
    server_hostname=server_hostname)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/ssl.py", line 423, in wrap_socket
    session=session
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/ssl.py", line 870, in _create
    self.do_handshake()
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/ssl.py", line 1139, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1091)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/bin/confindr.py", line 10, in <module>
    sys.exit(main())
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 1214, in main
    confindr(args)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 1031, in confindr
    check_for_databases_and_download(database_location=args.databases)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 930, in check_for_databases_and_download
    download_cgmlst_derived_data(database_location)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/site-packages/confindr_src/database_setup.py", line 235, in download_cgmlst_derived_data
    os.path.join(output_folder, 'confindr_db.tar.gz'))
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 543, in _open
    '_open', req)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 1393, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/scicomp/home-pure/gzu2/bin/anaconda3/envs/confindr/lib/python3.7/urllib/request.py", line 1352, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1091)>

pcrxn commented 1 year ago

Fixed in v0.8.1 (commit c4be1b4564f692f31261d21e751ba4db98785402), which adds a -u/--unverified option to confindr_database_setup.
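
For readers hitting the same `CERTIFICATE_VERIFY_FAILED` error: the tracebacks show Python's `urllib` rejecting a self-signed certificate in the chain, which is typical behind a firewall that re-signs HTTPS traffic. The sketch below is not ConFindr's code. It only illustrates, with the standard library, what an "unverified" download can look like, along with the alternative of trusting a specific CA bundle. The names `build_ssl_context` and `download` and the `cafile` parameter are hypothetical.

```python
import shutil
import ssl
import urllib.request


def build_ssl_context(unverified=False, cafile=None):
    """Build an SSL context for HTTPS downloads.

    cafile: optional path to a CA bundle (for example, the firewall's
            root certificate in PEM format) to trust instead of the
            system defaults.
    unverified: if True, skip certificate and hostname checks entirely.
    """
    context = ssl.create_default_context(cafile=cafile)
    if unverified:
        # check_hostname must be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def download(url, destination, unverified=False, cafile=None):
    """Stream url to the file at destination using the chosen context."""
    context = build_ssl_context(unverified=unverified, cafile=cafile)
    with urllib.request.urlopen(url, context=context) as response, \
            open(destination, "wb") as handle:
        shutil.copyfileobj(response, handle)
```

Passing `cafile` keeps verification on while trusting the firewall's certificate, so it is usually the better choice when that certificate is available; `unverified=True` gives up protection against tampering and is best reserved for cases where no CA bundle can be obtained.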