A user-friendly workflow for phylogenomics
current version fails downloads with openssl / curl errors #74

Closed spleonard1 closed 1 year ago

spleonard1 commented 1 year ago

Hi! Trying to use the latest builds fails to download data with the following errors:

mamba create -y -n gtotree -c astrobiomike -c conda-forge -c bioconda -c defaults gtotree

Looking for: ['gtotree']

conda-forge/osx-64 No change
bioconda/osx-64 No change
bioconda/noarch No change
astrobiomike/osx-64 No change
astrobiomike/noarch No change
pkgs/main/osx-64 No change
pkgs/main/noarch No change
pkgs/r/osx-64 No change
pkgs/r/noarch No change
conda-forge/noarch @ 4.4MB/s 3.1s

Transaction

Prefix: /Users/leonard28/miniconda3/envs/gtotree

Updating specs:


Downloading and Extracting Packages

Preparing transaction: done
Verifying transaction: done
Executing transaction: \
####################################################################################
kofamscan version 1.3.0-2 has been successfully installed!

This software needs a database which can be downloaded from

For more details see


To activate this environment, use

 $ mamba activate gtotree

To deactivate an active environment, use

 $ mamba deactivate

(base) houtx11:nancys_bugs leonard28$ conda activate gtotree
(gtotree) houtx11:nancys_bugs leonard28$ gtt-hmms

               GToTree pre-packaged HMM SCG-sets

See for more info

The environment variable GToTree_HMM_dir is set to: /Users/leonard28/miniconda3/envs/gtotree/share/gtotree/hmm_sets/

The 15 available pre-packaged HMM SCG-sets include:

   Actinobacteria                    (138 genes)
   Alphaproteobacteria               (117 genes)
   Archaea                            (76 genes)
   Bacteria                           (74 genes)
   Bacteria_and_Archaea               (25 genes)
   Bacteroidetes                      (90 genes)
   Betaproteobacteria                (203 genes)
   Chlamydiae                        (286 genes)
   Cyanobacteria                     (251 genes)
   Epsilonproteobacteria             (260 genes)
   Firmicutes                        (119 genes)
   Gammaproteobacteria               (172 genes)
   Proteobacteria                    (119 genes)
   Tenericutes                        (99 genes)
   Universal_Hug_et_al                (16 genes)

Details can be found in: /Users/leonard28/miniconda3/envs/gtotree/share/gtotree/hmm_sets/hmm-sources-and-info.tsv

(gtotree) houtx11:nancys_bugs leonard28$

Downloading GToTree test data into the subdirectory GToTree-test-data/

Data being pulled from here:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (35) OpenSSL/3.1.0: error:0A000152:SSL routines::unsafe legacy renegotiation disabled

Downloading the small test data failed for some reason :( You can try downloading it yourself from the link printed above and running the test as follows after unpacking it:

GToTree -a GToTree-test-data/ncbi_accessions.txt \
        -g GToTree-test-data/genbank_files.txt \
        -f GToTree-test-data/fasta_files.txt \
        -A GToTree-test-data/amino_acid_files.txt \
        -m GToTree-test-data/genome_to_id_map.tsv \
        -p GToTree-test-data/pfam_targets.txt \
        -H Universal -t -D -j 4 -o GToTree-test-output -F

Then you can compare the output to what is depicted here:

Exiting for now.

I can get a functional version of gtotree (GToTree v1.6.34) by forcing the install of python=3.7 in the environment as the alternate install option in the readme suggests. This will download things appropriately, but trying to upgrade to the latest version (1.7.08) gets me back to that original error.

Thanks for your help!
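
For anyone trying to tell this failure apart from other download problems: the log above shows curl exiting with code 35 (a TLS handshake failure) together with the OpenSSL message "unsafe legacy renegotiation disabled". A minimal sketch of a check for that combination follows. The function name `classify_curl_failure` and the `$url` placeholder are hypothetical and not part of GToTree.

```shell
# Hypothetical helper (not part of GToTree): classify a curl failure from
# its exit code ($1) and the text it printed to stderr ($2).
classify_curl_failure() {
    exit_code="$1"
    stderr_text="$2"
    if [ "$exit_code" -eq 0 ]; then
        echo "ok"
    elif [ "$exit_code" -eq 35 ] &&
         printf '%s\n' "$stderr_text" | grep -q 'unsafe legacy renegotiation disabled'; then
        # Same signature as the error reported in this issue
        echo "legacy-renegotiation"
    elif [ "$exit_code" -eq 35 ]; then
        # Exit code 35 is curl's generic "SSL connect error"
        echo "other-tls-handshake"
    else
        echo "other"
    fi
}

# Usage sketch ($url stands in for the download link, which is not shown here):
#   err=$(curl -L --fail -sS -o GToTree-test-data.tar.gz "$url" 2>&1 >/dev/null)
#   classify_curl_failure "$?" "$err"
```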

AstrobioMike commented 1 year ago

hmm, thanks for reporting this and for all the details, @spleonard1!

I'm not sure what's going on yet, i can recreate it on at least one ubuntu system i have access to (ubuntu 20.04), but not on a different one that is ubuntu 22.04, and i can't recreate it on either of my two macs running monterey 12.5.1 and ventura 13.2.1. And i can find some mention of a problem with later openssl versions and ubuntu (e.g., here and here). But i'm not finding any fix yet in my troubleshooting and testing things, whether using an earlier version of openssl or not. In fact, for me, even when i install with python 3.7 and using the older GToTree 1.6.34, things still aren't working for me...

This is a big problem that seems to be related to openssl communicating with the download target site, but i have no idea what to do about it yet.

In an active GToTree v1.6.34 environment where things worked for you, can you run conda list and paste the output for me here so i can see what the versions are of everything in there?
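
If the full `conda list` output is too long to paste, a filter like the one below could narrow it to the packages most likely to matter for a TLS download failure. This is only a sketch: the function name `filter_tls_packages` and the choice of package names are assumptions, not something GToTree provides.

```shell
# Hypothetical helper: read `conda list` output on stdin and print the
# name and version of a handful of TLS/download-related packages.
# Header lines start with "#", so they never match the package names.
filter_tls_packages() {
    awk '$1 ~ /^(openssl|curl|libcurl|ca-certificates|python|gtotree)$/ { print $1, $2 }'
}

# Usage (inside the activated environment):
#   conda list | filter_tls_packages
```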

spleonard1 commented 1 year ago

Thanks for taking a look! I am on a Mac, Big Sur 11.7.4, which I unfortunately cannot upgrade because of IT restrictions

(gtotree) houtx11:nancys_bugs leonard28$ conda list

packages in environment at /Users/leonard28/miniconda3/envs/gtotree:


Name             Version    Build                 Channel
bc               1.07.1     h0d85af4_0            conda-forge
biopython        1.79       py37h69ee0a8_2        conda-forge
ca-certificates  2022.12.7  h033912b_0            conda-forge
dos2unix         7.4.1      0                     conda-forge
entrez-direct    16.2       h193322a_1            bioconda
fasttree         2.1.11     hdcdfbac_1            bioconda
gdbm             1.18       h8a0c380_2            conda-forge
gettext          0.21.1     h8a4c099_0            conda-forge
gmp              6.2.1      h2e338ed_0            conda-forge
gtotree          1.6.34     py37_0                astrobiomike
gzip             1.12       h5eb16cf_0            conda-forge
hmmer            3.3.2      h9722bc1_2            bioconda
iqtree                      h135ad0d_1            bioconda
kofamscan        1.3.0      hdfd78af_2            bioconda
openssl          3.1.0      hfd90126_0            conda-forge
python           3.7.12     hf3644f1_100_cpython  conda-forge

AstrobioMike commented 1 year ago


and if you run this in that environment, it is actually able to download it successfully?

curl -L --retry 10 --fail -o GToTree-test-data.tar.gz

Or does that give you an error?

If it's successful, it should be about 7MB, e.g.:

ls -lh GToTree-test-data.tar.gz
# 7.0M Mar 31 17:42 GToTree-test-data.tar.gz

spleonard1 commented 1 year ago

"that environment" meaning the functional one? Yes it downloaded just fine (stepping away from this computer for the next 48 hrs so won't be able to try any more troubleshooting until next week :) )

curl -L --retry 10 --fail -o GToTree-test-data.tar.gz 60
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 7163k  100 7163k    0     0  2236k      0  0:00:03  0:00:03 --:--:-- 4605k
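
The size check suggested earlier in the thread (the archive should be about 7 MB) can be scripted so that a truncated or empty download is caught before unpacking. The sketch below makes some assumptions: the function name `download_looks_complete` is hypothetical, and the default threshold of 5,000,000 bytes is an arbitrary value chosen to sit below the reported size.

```shell
# Hypothetical helper: succeed only if the file ($1) exists and is at
# least min_bytes ($2) large. The default threshold is an assumption,
# chosen to be somewhat below the ~7 MB reported in this thread.
download_looks_complete() {
    file="$1"
    min_bytes="${2:-5000000}"
    [ -f "$file" ] || return 1
    # Some wc implementations pad the count with spaces; strip them.
    size=$(wc -c < "$file" | tr -d ' ')
    [ "$size" -ge "$min_bytes" ]
}

# Usage:
#   download_looks_complete GToTree-test-data.tar.gz && tar -xzf GToTree-test-data.tar.gz
```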

AstrobioMike commented 1 year ago

Just notes about this right now:

Still not sure how to deal with this yet. There is some chatter in curl issues about implementing an option that would enable circumventing this, but so far the devs don't want to do that. I'm going to keep poking to try to find another way to deal with it, and keep a hopeful eye on those issues.

Since it's working on some systems and not others (some ubuntu are fine, some macs are fine), that seems to suggest maybe it's not just the openssl configuration, but ¯\_(ツ)_/¯ Hopefully it's not an issue for too many specific setups.
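
One thing that may differ between the systems that work and those that don't is which curl binary is first on the PATH and which TLS library it was built against. For example, the system curl on macOS is typically built against LibreSSL, not OpenSSL. The thread does not establish this as the cause, so the sketch below is only a way to gather that information. The function name `curl_tls_backend` is hypothetical.

```shell
# Hypothetical helper: read `curl --version` output on stdin and print the
# TLS library token from its first line, for example "OpenSSL/3.1.0".
# Prints nothing if none of the listed library names is present.
curl_tls_backend() {
    head -n 1 | grep -oE '(OpenSSL|LibreSSL|BoringSSL|GnuTLS|wolfSSL)/[^ ]+' | head -n 1
}

# Usage:
#   command -v curl                      # which curl binary is being used
#   curl --version | curl_tls_backend    # which TLS library it links to
```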

AstrobioMike commented 1 year ago

Since nothing had improved on this yet, i tried some things out and found the certificate problem doesn't happen with zenodo links. So i created copies on zenodo of the files GToTree used to download from figshare (test data and pre-packaged HMMs), and changed the target links. This shouldn't be a problem anymore on any system as of v1.7.10.

Please anyone re-open this if they still have an issue 👍
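
To check whether an installed GToTree is at or past v1.7.10, a dotted-version comparison along these lines may help. It is a sketch under stated assumptions: `version_ge` is a hypothetical helper, and reading the version from `conda list` assumes the package is named `gtotree` as in the listing earlier in this thread. The comparison is done numerically per field, so a component such as "08" in 1.7.08 is treated as 8.

```shell
# Hypothetical helper: succeed if dotted version $1 is >= dotted version $2.
# Each dot-separated field is compared as a number; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Usage (inside the activated environment):
#   installed=$(conda list | awk '$1 == "gtotree" { print $2 }')
#   version_ge "$installed" 1.7.10 && echo "download links point to zenodo"
```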