zqgao22 / HIGH-PPI

MIT License

F1 #10

Open Ystartff opened 10 months ago

Ystartff commented 10 months ago

Hello! I am wondering why the experiment turns out differently on every run even when a random seed is set. As a result, I cannot reproduce your best metrics.
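For context, one common cause of run-to-run variation under a fixed seed is that only some of the random number generators in the process are seeded. A minimal sketch follows, assuming a PyTorch-based training script; the helper name `seed_everything` is hypothetical and is not part of the HIGH-PPI code, and NumPy and PyTorch are seeded only if they are installed.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed every RNG available in this environment (hypothetical helper)."""
    # PYTHONHASHSEED only takes effect for interpreters started afterwards.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Prefer deterministic cuDNN kernels over autotuned ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


# Seeding twice with the same value should reproduce the same draws.
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)
```

Even with all generators seeded, some GPU operations can remain nondeterministic, so small differences between runs may persist.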

horacehht commented 9 months ago

Me too. I ran both PIPR and HIGH-PPI, 9 runs each. The average performance of PIPR is better than that of HIGH-PPI.

zqgao22 commented 9 months ago

Thank you for your interest. We will check the code and reply to you soon. However, please note that we did not use the complete SHS27k dataset for training and testing. This is because, for some proteins, we could not find their native structures in the current PDB database. To ensure fairness, the other baseline models were also trained and tested on the incomplete SHS27k dataset.

horacehht commented 9 months ago

I have checked the dataset I used. The complete SHS27k dataset contains 1,690 proteins. I checked GNN_PPI/data/protein.SHS27k.sequences.dictionary.tsv: your processed dataset contains 1,553 proteins, which is indeed fewer than the complete set. I am sure that I used your tsv and txt files. Using your dataset, I ran both PIPR and HIGH-PPI, and the average performance of HIGH-PPI is worse. Could you please provide the training parameters corresponding to your best performance?
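A protein count like the one above can be checked with a short script. This is a sketch only: it assumes the dictionary file is tab-separated with the protein ID in the first column, which is not confirmed in this thread, and the function name `count_proteins` and the demo IDs are made up for illustration.

```python
import csv
import tempfile


def count_proteins(tsv_path: str) -> int:
    """Count distinct protein IDs in the first column of a TSV file."""
    ids = set()
    with open(tsv_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and rows with an empty ID field.
            if row and row[0].strip():
                ids.add(row[0].strip())
    return len(ids)


# Tiny made-up file standing in for the real dictionary (assumed layout).
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write("P1\tMKV\nP2\tMAA\nP1\tMKV\n")

demo_count = count_proteins(tmp.name)
print(demo_count)
```

Pointing `count_proteins` at the real tsv file, instead of the temporary demo file, would give the number to compare against the 1,690 proteins of the complete dataset.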

zqgao22 commented 9 months ago

Sure, Haitao. We will provide available checkpoints and more detailed hyperparameters immediately.