Closed Heisenburger2020 closed 2 months ago
Hello!
Yes, our dataset is different from the one used in the original paper. We mention this in our paper:
"This pre-processing resulted in different datasets for all tasks, such as EC and GO. Therefore, the results reported in our paper should not be directly compared with those of other papers."
Hi!
After careful examination, we found that there was a slight difference in the EC and GO evaluation.
Specifically, we copied the evaluation function from GearNet.
This function expects inputs of shape (B, N), where B is the number of proteins and N is the number of labels. However, our predictions and targets were flattened before evaluation, so their shape was (1, B*N). This does not raise an error, but it causes the reported results to be lower. Intuitively, the flattened version evaluates at a global level, whereas the intended evaluation averages the result over proteins.
We have revised our evaluation code by reshaping the input tensors, as shown below:
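(The revised snippet is not reproduced in this thread. As an illustration only, here is a simplified, hypothetical protein-centric Fmax in NumPy — not GearNet's actual `f1_max`; the threshold grid and tie handling are assumptions — showing why passing (1, B*N) instead of (B, N) changes the metric:)

```python
import numpy as np

def fmax(pred, target):
    """Simplified protein-centric Fmax (illustrative sketch, not the repo's code).

    For each threshold, average per-protein precision and recall, then
    take the best F1 over thresholds.
    pred, target: arrays of shape (B, N) = (num proteins, num labels).
    """
    best = 0.0
    for t in np.linspace(0, 1, 21):
        binarized = pred >= t
        tp = (binarized & (target == 1)).sum(axis=1)   # true positives per protein
        n_pred = binarized.sum(axis=1)                 # predictions per protein
        n_true = target.sum(axis=1)                    # true labels per protein
        has_pred = n_pred > 0
        if not has_pred.any():
            continue
        # Precision is averaged over proteins with at least one prediction;
        # recall is averaged over all proteins.
        precision = (tp[has_pred] / n_pred[has_pred]).mean()
        recall = (tp / np.maximum(n_true, 1)).mean()
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best

pred = np.array([[0.9, 0.2], [0.8, 0.3]])
target = np.array([[1, 0], [0, 1]])

per_protein = fmax(pred, target)                               # correct (B, N) input
flattened = fmax(pred.reshape(1, -1), target.reshape(1, -1))   # buggy (1, B*N) input
# per_protein ≈ 0.857, flattened = 0.8: the flattened call treats all
# protein-label pairs as one giant protein, so the score differs.
```

The fix is therefore just to reshape the flattened tensors back to (B, N) before calling the evaluation function.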
Kindly note that our key conclusion does not change: Saprot remains a SOTA model under the new setting (we will update the results soon).
Thank you again for pointing out this problem! :)
Hi! New results have been updated!
Dear Sir,
Why are the EC and GO results in Saprot so much lower than those in the original paper "Enhancing Protein Language Model with Structure-based Encoder and Pre-training"? I wonder whether the dataset is different.