AnantharamanLab / VIBRANT

Virus Identification By iteRative ANnoTation
GNU General Public License v3.0

KEGG/Pfam Pressing Error #89

Closed joshuakirsch closed 5 months ago

joshuakirsch commented 5 months ago

Hi,

I've updated the download links for the Pfam and KOfam databases in VIBRANT_setup.py, but I keep getting an error for these databases during setup. Here's my log file:

This script will download, extract subsets and press HMM profiles for VIBRANT.
This process will require ~20GB of temporary free storage space, but the final size requirement is ~11GB in the form of pressed HMM databases.
Please be patient. This only needs to be run once and will take a few minutes.

Verifying Pfam, KEGG and VOG source websites are available for download ...

Downloading HMM profiles for Pfam, KEGG and VOG from their source websites ...

Unzipping profiles ...

Concatenating individual profiles ...

Extracting profiles used for VIBRANT ...

Retrieved 19182 HMMs.

Retrieved 9980 HMMs.

Pressing profiles used for VIBRANT ...
Working...    Working...    Working...    done.
Pressed and indexed 19182 HMMs (19182 names).
Models pressed into binary file:   VOGDB94_phage.HMM.h3m
SSI index for binary model file:   VOGDB94_phage.HMM.h3i
Profiles (MSV part) pressed into:  VOGDB94_phage.HMM.h3f
Profiles (remainder) pressed into: VOGDB94_phage.HMM.h3p
done.
Pressed and indexed 20795 HMMs (20795 names and 20795 accessions).
Models pressed into binary file:   Pfam-A_v32.HMM.h3m
SSI index for binary model file:   Pfam-A_v32.HMM.h3i
Profiles (MSV part) pressed into:  Pfam-A_v32.HMM.h3f
Profiles (remainder) pressed into: Pfam-A_v32.HMM.h3p
done.
Pressed and indexed 9980 HMMs (9980 names).
Models pressed into binary file:   KEGG_profiles_prokaryotes.HMM.h3m
SSI index for binary model file:   KEGG_profiles_prokaryotes.HMM.h3i
Profiles (MSV part) pressed into:  KEGG_profiles_prokaryotes.HMM.h3f
Profiles (remainder) pressed into: KEGG_profiles_prokaryotes.HMM.h3p

Done with databases. Several new databases are now in this folder.

Verying correct dependency versions ...

VIBRANT Error: it looks like the KEGG HMM profiles were not downloaded/pressed correctly. Try re-running the VIBRANT_setup.py script.
VIBRANT Error: it looks like the Pfam HMM profiles were not downloaded/pressed correctly. Try re-running the VIBRANT_setup.py script.

joshuakirsch commented 5 months ago

Possibly the sizes of the updated databases conflict with the previous sizes in VIBRANT_setup.py? From my poor understanding of the code, VOG should have 19182 HMMs, Pfam should have 17929 HMMs, and KEGG should have 10033 HMMs, whereas the log above reports 20795 HMMs pressed for Pfam and 9980 for KEGG.
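One way to test this hypothesis is to count the profiles in each concatenated HMM file and compare against the expected totals. The sketch below is a minimal, unofficial check: it assumes the files are HMMER3 text profiles, where each model ends with a `//` line, and it takes the file names from the log and the expected counts from the comment above. Those counts are the poster's reading of VIBRANT_setup.py and are not verified here; `count_hmm_profiles` and `compare_counts` are hypothetical helper names, not part of VIBRANT.

```python
from pathlib import Path

# Expected totals as quoted in the comment above (an assumption, not
# confirmed against VIBRANT_setup.py). File names are taken from the log.
EXPECTED_HMM_COUNTS = {
    "VOGDB94_phage.HMM": 19182,
    "Pfam-A_v32.HMM": 17929,
    "KEGG_profiles_prokaryotes.HMM": 10033,
}


def count_hmm_profiles(path):
    """Count profiles in an HMMER3 text file by counting '//' terminator lines."""
    with open(path, "r", errors="replace") as handle:
        return sum(1 for line in handle if line.strip() == "//")


def compare_counts(directory, expected=EXPECTED_HMM_COUNTS):
    """Return {file name: (found, expected)} for every file whose count differs.

    `found` is None when the file does not exist in `directory`.
    """
    mismatches = {}
    for name, want in expected.items():
        path = Path(directory) / name
        found = count_hmm_profiles(path) if path.is_file() else None
        if found != want:
            mismatches[name] = (found, want)
    return mismatches
```

Calling `compare_counts(".")` from the database folder returns an empty dictionary when every count matches, and otherwise lists each file with the count found next to the count expected.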

joshuakirsch commented 5 months ago

OK, this was partially my error. For future reference, I think the links on lines 72-73 of VIBRANT_setup.py do not match the links on lines 100-101. I updated the links and received a successful install message. I've attached an updated VIBRANT_setup.py in case anyone wants to use it (make sure to change the file extension back from .txt to .py): VIBRANT_setup.txt
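For anyone editing the links by hand, a quick consistency check can catch this kind of mismatch before rerunning setup. The following is a rough sketch, not part of VIBRANT: it assumes both line ranges contain plain `http`, `https`, or `ftp` URLs, and the helper names `urls_in_lines` and `mismatched_urls` are made up for illustration.

```python
import re

# Matches http(s)/ftp URLs up to the next whitespace or quote character.
URL_PATTERN = re.compile(r"""(?:https?|ftp)://[^\s'"]+""")


def urls_in_lines(source_text, first, last):
    """Return URLs found on 1-indexed lines first..last (inclusive)."""
    lines = source_text.splitlines()[first - 1:last]
    return [url for line in lines for url in URL_PATTERN.findall(line)]


def mismatched_urls(source_text, range_a, range_b):
    """Return the sorted URLs that appear in only one of the two line ranges."""
    urls_a = set(urls_in_lines(source_text, *range_a))
    urls_b = set(urls_in_lines(source_text, *range_b))
    return sorted(urls_a ^ urls_b)
```

With the script's contents read into a string `text`, `mismatched_urls(text, (72, 73), (100, 101))` returns an empty list when the two sets of links agree. The line ranges are the ones mentioned in the comment above and may differ in other versions of the script.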