Hello @AstrobioMike, I found what seems to be an easy-to-solve issue with GToTree regarding the URLs for GTDB metadata. See below:
System environment
Mac OS, GToTree version 1.8.2
Problem description
When gtt-check-or-setup-GTDB-files is run in a fresh install of GToTree, the following HTTP 404 error occurs:
$ gtt-check-or-setup-GTDB-files
Downloading and parsing archaeal and bacterial metadata tables from
GTDB (only needs to be done once)...
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/bin/gtt-check-or-setup-GTDB-files.backup", line 161, in <module>
    main()
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/bin/gtt-check-or-setup-GTDB-files.backup", line 31, in main
    check_and_or_get_gtdb_files(os.environ["GTDB_dir"])
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/bin/gtt-check-or-setup-GTDB-files.backup", line 157, in check_and_or_get_gtdb_files
    gen_gtdb_tab(GTDB_dir)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/bin/gtt-check-or-setup-GTDB-files.backup", line 92, in gen_gtdb_tab
    arc_tar_gz = urllib.request.urlopen("https://data.gtdb.ecogenomic.org/releases/latest/ar53_metadata.tar.gz")
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/gtotree_1.8.2/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
Proposed solution
The URLs in the gen_gtdb_tab function of gtt-check-or-setup-GTDB-files no longer seem to match the URLs to the metadata files in the latest release of the GTDB (r214).
Current URLs used in GToTree
Actual URLs in GTDB release 214
Changing tar to tsv in the URLs and then re-running gtt-check-or-setup-GTDB-files worked on my end.
Revised code:
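For illustration only (this is a rough sketch of the change described above, not the exact patch to gen_gtdb_tab), the snippet below swaps the .tar.gz suffix for .tsv.gz and tries each candidate URL in turn, skipping any that return a 404. The helper names tar_to_tsv and open_first_available are hypothetical, and the resulting .tsv.gz URL is an assumption inferred from the tar-to-tsv change; only the archaeal URL from the traceback is used.

```python
import urllib.error
import urllib.request

# URL copied from the traceback above. The bacterial counterpart is not shown
# in this report, so only the archaeal file is used here.
OLD_ARC_URL = "https://data.gtdb.ecogenomic.org/releases/latest/ar53_metadata.tar.gz"


def tar_to_tsv(url):
    """Swap a trailing '.tar.gz' for '.tsv.gz' (hypothetical helper)."""
    suffix = ".tar.gz"
    if url.endswith(suffix):
        return url[: -len(suffix)] + ".tsv.gz"
    return url


def open_first_available(urls, opener=urllib.request.urlopen):
    """Return (url, response) for the first URL that does not give a 404.

    Any HTTP error other than 404 is re-raised immediately. `opener` is
    injectable so the logic can be exercised without network access.
    """
    if not urls:
        raise ValueError("no candidate URLs given")
    last_error = None
    for url in urls:
        try:
            return url, opener(url)
        except urllib.error.HTTPError as err:
            if err.code != 404:
                raise
            last_error = err  # remember the 404 and try the next candidate
    raise last_error


if __name__ == "__main__":
    # Assumed new location: same path with 'tar' replaced by 'tsv'.
    candidates = [tar_to_tsv(OLD_ARC_URL), OLD_ARC_URL]
    print("Would try, in order:")
    for candidate in candidates:
        print(" ", candidate)
```

Trying the new name first and falling back to the old one would keep the script working if a later GTDB release changes the file name again, though whether that is worth the extra code is a judgment call for the maintainer.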
gtt-test.sh finishes without errors after downloading the GTDB metadata via the revised URLs above.
Final comments
Thanks for all your work on GToTree! It's an extremely helpful package!