uw-ipd / RoseTTAFold2NA

RoseTTAFold2 protein/nucleic acid complex prediction
MIT License

Where is update_blastdb.pl? #75

Closed: JiayouZhang closed this issue 9 months ago

JiayouZhang commented 9 months ago

In readme.md:

# nt [151G]
update_blastdb.pl --decompress nt
cd ..

But I can't find update_blastdb.pl.

denizkavi commented 9 months ago

I was able to retrieve it from here: https://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/app/blast/update_blastdb.pl. I'm not sure this is how the authors intended it to be obtained, but most scripts that are missing from the repo can be found with a Google search.

Edit: This is the wrong way to approach it. You should be able to run it directly from your shell (though not in a bash file), because it is apparently installed in your conda env.

JiayouZhang commented 9 months ago

It turns out that when you install the repo's dependencies, update_blastdb.pl is installed into your conda environment:

which update_blastdb.pl 
/path-to-miniconda3/envs/env-name/bin/update_blastdb.pl
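
For anyone who wants to script the check above, here is a minimal sketch. It assumes a bash-compatible shell and that the conda environment is already activated. `locate_tool` is a hypothetical helper name for illustration, not something shipped with RoseTTAFold2NA or BLAST+.

```shell
#!/usr/bin/env bash
# Sketch: report where a command-line tool resolves from in the current environment.
# locate_tool is a hypothetical helper name, not part of RoseTTAFold2NA or BLAST+.
locate_tool() {
    local tool="$1"
    local path
    # command -v prints the resolved path and fails if the tool is not on PATH.
    if path="$(command -v "$tool")"; then
        echo "$tool found at: $path"
    else
        echo "$tool not found on PATH; is the conda environment activated?" >&2
        return 1
    fi
}

# Run the check for the script discussed in this thread; a miss only prints a hint.
locate_tool update_blastdb.pl || true
```

If the script is reported as missing, activating the environment created from the repo's dependency file and running the check again is a reasonable first step.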

ShuZishan commented 7 months ago

Why is the nt dataset downloaded from https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/ larger (378 GB) than the one downloaded with update_blastdb.pl --decompress nt (151 GB)? What accounts for the difference between the two downloads? Could you explain what data has been added or removed, and why? I would greatly appreciate it.