Open sj584 opened 1 year ago
Could you provide the files (4, 5, 6, 7) you mentioned above? I would like to reproduce the amazing results in your paper. Thanks!
Thanks for reaching out! I initially didn't upload the above files because they were too large.
I have uploaded the files to the Google Drive link below. https://drive.google.com/drive/folders/1erW8dht3YB6dAH6Z1YkbXHefgxfR_qjd?usp=share_link I'll fix the README so that everyone can access the files easily.
Thanks for your reply! I found that your data contains only 45 example PDB files, which is not enough for model training. Could you provide the Python source files you used for data processing?
These files (generate_graphs.py, remove_symmetry.py, PSAIA_PSSM_2_pkl.py, generate_label.py) are missing from your git repo.
Thank you for pointing out my mistakes.
I uploaded the .py files you noted to the GitHub repository. I also uploaded the preprocessed epitope3D dataset to the Google Drive link.
After you load them with pickle, you'll get lists. Just concatenate the two lists, like: total_train = train_pyg_surface_0.15 + test_pyg_surface_0.15
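A minimal sketch of that step, in case it helps. The file names are an assumption taken from the variable names above (with a .pkl extension added), and unpickling the graphs will most likely require the same PyTorch Geometric version that was used to create them.

```python
import pickle
from pathlib import Path


def load_and_merge(paths):
    """Load each pickle file (expected to hold a list) and concatenate the lists in order."""
    merged = []
    for path in paths:
        with open(Path(path), "rb") as handle:
            graphs = pickle.load(handle)
        if not isinstance(graphs, list):
            raise TypeError(f"{path} holds {type(graphs).__name__}, expected a list")
        merged.extend(graphs)
    return merged


# Example call (assumed file names, adjust to what you downloaded):
# total_train = load_and_merge(["train_pyg_surface_0.15.pkl", "test_pyg_surface_0.15.pkl"])
```

Only unpickle files from a source you trust, since pickle can execute arbitrary code on load.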
The model (.pt) trained on 200 PDBs is stored in the checkpoint file.
This example preprocesses data from the Epitope3D external test set, "epitope3d_dataset_45_Blind_Test.csv".
Tutorials
Prerequisites: a CSV file of PDB IDs in the Data_processing directory (with or without epitope labels)
collect_pdb.py --> example_pdb/*.pdb
collect_fasta.py --> example_fasta/*.fasta
Generate the PSAIA and PSSM data with:
PSAIA: Structure Analyser: Accessible Surface Area, Depth Index, Protrusion Index, Hydrophobicity, Analyse as Bound, Table Output
PSI-BLAST: psiblast -query example.fasta -db swissprot -num_iterations 3 -out_ascii_pssm example.pssm
After this process, you'll get
example_psaia/*.tbl, example_pssm/*.pssm
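The PSI-BLAST step has to be repeated for every FASTA file, so a small wrapper may be convenient. The sketch below is not part of the repository. It assumes the BLAST+ executable is available on your PATH as `psiblast`, that a local swissprot database is installed, and that inputs and outputs live in example_fasta/ and example_pssm/. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_psiblast_cmd(fasta_path, pssm_dir, db="swissprot", num_iterations=3):
    """Return the psiblast argument list for one FASTA file."""
    fasta_path = Path(fasta_path)
    # Name the PSSM after the FASTA file, e.g. 1abc.fasta -> 1abc.pssm
    out_path = Path(pssm_dir) / f"{fasta_path.stem}.pssm"
    return [
        "psiblast",
        "-query", str(fasta_path),
        "-db", db,
        "-num_iterations", str(num_iterations),
        "-out_ascii_pssm", str(out_path),
    ]


def run_psiblast_on_dir(fasta_dir="example_fasta", pssm_dir="example_pssm"):
    """Run PSI-BLAST once per *.fasta file (assumes BLAST+ and the database are installed)."""
    Path(pssm_dir).mkdir(parents=True, exist_ok=True)
    for fasta_path in sorted(Path(fasta_dir).glob("*.fasta")):
        subprocess.run(build_psiblast_cmd(fasta_path, pssm_dir), check=True)
```

Calling `run_psiblast_on_dir()` from the Data_processing directory would then write one .pssm file per sequence. PSAIA is typically run through its own interface, so it is not covered here.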
generate_graphs.py --> example_graphs_5A.pkl
remove_symmetry.py --> example_nonsym_graphs_5A.pkl
PSAIA_PSSM_2_pkl.py --> example_psaia.pkl, example_pssm.pkl
generate_label.py (skip this step if you don't have epitope labels) --> example_label_dict.pkl
generate_dataset.py --> example_pyg_surface_"RSA".pkl
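Since each step depends on the output of the previous one, it can help to check which outputs already exist before running the next script. The sketch below is only an illustration. It assumes every output is written relative to one working directory, and `first_missing_stage` is a hypothetical helper, not part of the repository. The final generate_dataset.py output is left out because its file name depends on the RSA value.

```python
from pathlib import Path

# (step, expected output pattern), in the order listed in the tutorial above.
PIPELINE = [
    ("collect_pdb.py", "example_pdb/*.pdb"),
    ("collect_fasta.py", "example_fasta/*.fasta"),
    ("PSAIA", "example_psaia/*.tbl"),
    ("PSI-BLAST", "example_pssm/*.pssm"),
    ("generate_graphs.py", "example_graphs_5A.pkl"),
    ("remove_symmetry.py", "example_nonsym_graphs_5A.pkl"),
    ("PSAIA_PSSM_2_pkl.py", "example_psaia.pkl"),
    ("PSAIA_PSSM_2_pkl.py", "example_pssm.pkl"),
    ("generate_label.py", "example_label_dict.pkl"),  # optional without labels
]


def first_missing_stage(root=".", skip_labels=False):
    """Return (step, pattern) for the first step whose output is absent, or None."""
    root = Path(root)
    for step, pattern in PIPELINE:
        if skip_labels and step == "generate_label.py":
            continue
        # glob also matches plain file names, so one check covers files and directories
        if not any(root.glob(pattern)):
            return step, pattern
    return None
```

For example, `first_missing_stage("Data_processing", skip_labels=True)` would report the first step that still needs to be run when no epitope labels are available.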