Closed YaoYinYing closed 1 year ago
I tried to run the script, but it fails for me on zsh. I think it goes wrong when you try to call system() inside awk.
Maybe the script is salvageable if you just print all the commands to stdout and then pipe them into bash.
Here's a possible patch (I haven't tested it too thoroughly). You should be able to run git apply patch, test, and commit to update this PR: https://gist.github.com/tomsercu/46321dbca8ded930900cfbbf21483f56
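A minimal sketch of the print-then-pipe idea, not the actual patch: the model names are hypothetical and echo stands in for the real download command. awk only emits one command per input line, and a separate bash process executes them, so nothing depends on how awk's system() behaves in a given shell.

```shell
#!/usr/bin/env bash
# Hypothetical model names; "echo would-download" stands in for the real downloader.
generate_commands() {
    # awk only prints shell commands here; it never calls system().
    printf '%s\n' model_a model_b |
        awk '{ print "echo would-download " $1 ".pt" }'
}

# Inspect the generated commands first...
generate_commands

# ...then execute them by piping the same text into a fresh bash.
output=$(generate_commands | bash)
echo "$output"
```

Because the commands are plain text until the final pipe, they can be reviewed or saved to a file before anything is run.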
Well, it looks like it works on both my MacBook (awk version 20200816) and my Ubuntu workstation (GNU Awk 5.0.1), so I have no idea why it runs into an error. What does the error message look like? Does this occur with a different version of awk?
Your patch looks great if one runs a command like bash scripts/download_weights.sh /path/to/weights/esm | bash, which should generate a series of download commands and run all of them in another bash.
The question is this: I simply guessed the URLs of *-regression.pt from those of the weights files listed in README.md, yet some of the inferred URLs do not exist at all, which makes aria2c report a download failure. As I had planned it, this failure can be ignored by system() in awk scripts. In that case, set -e will stop the entire script instead.
bash scripts/download_weights.sh ../../db/weights/esm/ | bash
/path/to/db/weights/esm/checkpoints /path/to/repo/esm
Download complete: esm2_t48_15B_UR50D.pt
Download complete: esm2_t48_15B_UR50D-contact-regression.pt
Download complete: esm2_t36_3B_UR50D.pt
Download complete: esm2_t36_3B_UR50D-contact-regression.pt
Download complete: esm2_t33_650M_UR50D.pt
Download complete: esm2_t33_650M_UR50D-contact-regression.pt
Download complete: esm2_t30_150M_UR50D.pt
Download complete: esm2_t30_150M_UR50D-contact-regression.pt
Download complete: esm2_t12_35M_UR50D.pt
Download complete: esm2_t12_35M_UR50D-contact-regression.pt
Download complete: esm2_t6_8M_UR50D.pt
Download complete: esm2_t6_8M_UR50D-contact-regression.pt
Download complete: esmfold_3B_v1.pt
Download not complete: esmfold_3B_v1-contact-regression.pt
11/22 17:09:32 [NOTICE] Downloading 1 item(s)
p11-kit: softhsm: module failed to initialize, skipping: Internal error
[#94d511 0B/0B CN:1 DL:0B]
11/22 17:09:35 [ERROR] CUID#7 - Download aborted. URI=https://dl.fbaipublicfiles.com/fair-esm/regression/esmfold_3B_v1-contact-regression.pt
Exception: [AbstractCommand.cc:351] errorCode=22 URI=https://dl.fbaipublicfiles.com/fair-esm/regression/esmfold_3B_v1-contact-regression.pt
-> [HttpSkipResponseCommand.cc:240] errorCode=22 The response status is not successful. status=403
11/22 17:09:35 [NOTICE] Download GID#94d511c5c8bff1da not complete:
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
94d511|ERR | 0B/s|https://dl.fbaipublicfiles.com/fair-esm/regression/esmfold_3B_v1-contact-regression.pt
Status Legend:
(ERR):error occurred.
aria2 will resume download if the transfer is restarted.
If there are any errors, then see the log file. See '-l' option in help/man page for details.
For your patch, I suggest skipping this error as follows:
# download regression
print "if [[ ! -f $(basename "url_regression") || -f $(basename "url_regression").aria2 ]];then echo Download not complete: $(basename "url_regression");aria2c -x 10 "url_regression" || echo Nevermind; else echo Download complete: $(basename "url_regression");fi"
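The logic of that one-liner can be sketched on its own, under some assumptions: needs_download and fake_download are hypothetical helpers, and fake_download always fails to imitate the 403 response in place of a real aria2c call. A file is treated as incomplete when it is missing or when aria2's .aria2 control file still sits next to it, and the trailing || echo keeps set -e from aborting the run.

```shell
#!/usr/bin/env bash
set -e  # abort on any unhandled failure

# A file counts as incomplete when it is missing, or when aria2's
# control file (<name>.aria2) is still present next to it.
needs_download() {
    [[ ! -f "$1" || -f "$1.aria2" ]]
}

# Hypothetical stand-in for aria2c that always fails, like a 403 response.
fake_download() {
    return 22
}

workdir=$(mktemp -d)
touch "$workdir/done.pt"                                  # finished file
touch "$workdir/partial.pt" "$workdir/partial.pt.aria2"   # interrupted file

results=()
for name in done.pt partial.pt missing.pt; do
    if needs_download "$workdir/$name"; then
        # "|| echo" turns the failure into a success, so set -e does not fire.
        fake_download "$workdir/$name" || echo "Nevermind: $name"
        results+=("incomplete:$name")
    else
        results+=("complete:$name")
    fi
done

joined="${results[*]}"
echo "$joined"
rm -rf "$workdir"
```

Without the || echo part, the first failing download would end the script before the remaining files were checked.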
Yes, the regression weights for contact prediction are only there for some models, so you can skip the missing ones silently. Let me know once the script is updated on this PR and I'll merge!
@tomsercu Thanks! This script is now updated to skip download failures and has been tested on both zsh and bash.
Thanks for your contribution!
This is a re-PR of #329, following kind hints from @tomsercu.
README.md. pt files will be located at a subdir called checkpoints.
--model_pth option to esmfold inference script to read pretrained esm weights. This might be useful for all users in one HPC cluster or workstation.