Open AaranWang opened 2 months ago
Hi, Wang,
Sorry for the late reply. We have updated the model weights, example dataset, and zero-shot scripts for ProtLGN:
ckpt/ProtLGN.pt
data/example
script/mutant_predict.sh
We also recently developed two more advanced protein engineering tools, ProtSSN and ProSST, for zero-shot prediction. We recommend trying the new models!
Best wishes, Yang Tan
Thank you. I'm trying to reproduce the results of ProtLGN. I ran into a problem in Step 2 (build graph dataset) when I ran the command "python data.py --build_cath --protein_dataset data/cath40_k10 --c_alpha_max_neighbors 10 --use_sasa --use_bfactor --use_dihedral --use_coordinate" from script/build_cath_dataset.sh. I have downloaded cath-dataset4.2.0 and put it in the data/cath_k10/raw directory. Are the materials for building the graph dataset complete now? If not, could you please provide the data for the data/cath_k10/raw directory? Thank you.
The error message:
$ python data.py --build_cath --protein_dataset data/cath40_k10 --c_alpha_max_neighbors 10 --use_sasa --use_bfactor --use_dihedral --use_coordinate
Processing...
0it [00:00, ?it/s]
Traceback (most recent call last):
File "/home/wangq/Programs/ProtLGN/data.py", line 102, in
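The `0it [00:00, ?it/s]` line suggests the processing loop found nothing to iterate over. One possible cause (an assumption, not confirmed in this thread) is that the raw directory under the path passed to `--protein_dataset` is empty or missing. Note that the command uses `data/cath40_k10`, while the files were placed under `data/cath_k10/raw`. Below is a minimal sketch for checking this before running `data.py`. The helper name `count_raw_files` and the `<dataset>/raw` layout are hypothetical, not taken from the ProtLGN code.

```python
from pathlib import Path


def count_raw_files(dataset_root: str) -> int:
    """Count regular files under <dataset_root>/raw (assumed layout)."""
    raw_dir = Path(dataset_root) / "raw"
    if not raw_dir.is_dir():
        # A missing raw directory is treated the same as an empty one.
        return 0
    return sum(1 for p in raw_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Both spellings that appear in this thread are checked.
    for root in ("data/cath40_k10", "data/cath_k10"):
        print(f"{root}/raw: {count_raw_files(root)} files")
```

A count of zero for the directory named in the command would be consistent with the empty progress bar, although other causes are possible.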
I have updated the data processing script, and it now works well.
mkdir -p data/cath_k10/raw
cd data/cath_k10/raw
wget https://huggingface.co/datasets/tyang816/cath/resolve/main/dompdb.tar
# or wget https://lianglab.sjtu.edu.cn/files/ProtSSN-2024/dompdb.tar
tar -xvf dompdb.tar
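Before extracting, it can help to confirm that the download is a real tar archive, since a download that saved a web page instead of the file would fail at `tar -xvf`. Below is a minimal sketch using Python's standard `tarfile` module. The helper name `extract_if_tar` is hypothetical and not part of ProtLGN.

```python
import tarfile
from pathlib import Path


def extract_if_tar(archive: str, dest: str) -> int:
    """Extract `archive` into `dest` and return the number of members.

    Raises ValueError if the file is not a readable tar archive.
    """
    if not tarfile.is_tarfile(archive):
        raise ValueError(f"{archive} is not a tar archive; re-download it")
    Path(dest).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:
        members = tar.getmembers()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return len(members)
```

For example, `extract_if_tar("dompdb.tar", ".")` run inside `data/cath_k10/raw` would mirror the `tar -xvf dompdb.tar` step above.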
Thank you, I have successfully built the graph dataset. Now I have run into another problem: when I ran script/run_pretrain.sh, it showed the error "FileNotFoundError: [Errno 2] No such file or directory: 'data/proteingym_valid/Proteingym_validk10'". Can you share an example ProteinGym dataset? Thank you.
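A quick pre-flight check can list which expected paths are absent before a long run starts. This is only a sketch: the helper name `missing_paths` is hypothetical, and the single path listed is the one quoted in the error message; any other paths the pretraining script needs are unknown here.

```python
from pathlib import Path
from typing import Iterable, List


def missing_paths(paths: Iterable[str]) -> List[str]:
    """Return the subset of `paths` that does not exist on disk."""
    return [p for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    # Path copied from the FileNotFoundError; extend the list as needed.
    required = ["data/proteingym_valid/Proteingym_validk10"]
    for p in missing_paths(required):
        print(f"missing: {p}")
```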
Could you provide some more explanation of how to construct and run ProtLGN?