zw-SIMM closed this issue 3 months ago
Sorry for bothering you.
I found that you have already provided the file files/cached_enzymemap.p, which fully reproduces your split:
TRAIN DATASET CREATED FOR ENZYMEMAP_REACTION_GRAPH.
* Number of samples: 34427
* Number of reactions: 12629
* Number of proteins: 9794
* Number of ECs: 2251
DEV DATASET CREATED FOR ENZYMEMAP_REACTION_GRAPH.
* Number of samples: 7287
* Number of reactions: 2669
* Number of proteins: 1964
* Number of ECs: 465
TEST DATASET CREATED FOR ENZYMEMAP_REACTION_GRAPH.
* Number of samples: 4642
* Number of reactions: 1554
* Number of proteins: 1407
* Number of ECs: 319
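For anyone who wants to check their own split against the numbers above, here is a minimal sketch of how such per-split counts could be computed. It assumes each sample is a dict with the keys `reaction`, `protein_id`, and `ec`; these field names are hypothetical and are not taken from the repository or from the structure of the cached file.

```python
from typing import Dict, Iterable


def summarize_split(samples: Iterable[Dict[str, str]]) -> Dict[str, int]:
    """Count samples and unique reactions, proteins, and ECs in one split.

    The keys "reaction", "protein_id", and "ec" are hypothetical field
    names chosen for this illustration.
    """
    samples = list(samples)
    return {
        "samples": len(samples),
        "reactions": len({s["reaction"] for s in samples}),
        "proteins": len({s["protein_id"] for s in samples}),
        "ecs": len({s["ec"] for s in samples}),
    }


# Toy data only; these are not real EnzymeMap entries.
toy = [
    {"reaction": "A>>B", "protein_id": "P1", "ec": "1.1.1.1"},
    {"reaction": "A>>B", "protein_id": "P2", "ec": "1.1.1.1"},
    {"reaction": "C>>D", "protein_id": "P2", "ec": "2.7.1.1"},
]
print(summarize_split(toy))
# {'samples': 3, 'reactions': 2, 'proteins': 2, 'ecs': 2}
```

Running the same function on the train, dev, and test splits gives numbers that can be compared line by line with the log output.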
Thanks for your excellent work and contributions.
I noticed that when I split the dataset by rule_id, the results I obtain differ slightly from the numbers reported in your paper. Could this be due to random splitting?
My splitting:
The splitting in the paper:
To ensure that we can accurately reproduce your experimental results, would you mind providing the specific dataset split IDs or the exact split files used in the paper?
Thank you very much for your assistance!
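To illustrate why I suspect the random seed, here is a minimal sketch of a group-wise split by rule_id. It is not the repository's actual splitting code: the function name `split_by_group`, the 80/10/10 fractions, and the seed handling are all assumptions made for illustration. It shows that every rule_id lands in exactly one split, and that the assignment depends on the shuffle seed, so sample counts can shift between runs unless the seed or the split IDs are fixed.

```python
import random
from typing import Dict, List, Sequence


def split_by_group(
    samples: Sequence[dict],
    key: str = "rule_id",
    fractions: Sequence[float] = (0.8, 0.1, 0.1),  # assumed ratios, not from the paper
    seed: int = 0,
) -> Dict[str, List[dict]]:
    """Assign whole groups (e.g. all samples sharing a rule_id) to one split."""
    # Sort first so the result depends only on the seed, not on input order.
    groups = sorted({s[key] for s in samples})
    random.Random(seed).shuffle(groups)

    n_train = int(fractions[0] * len(groups))
    n_dev = int(fractions[1] * len(groups))

    assignment = {}
    for i, group in enumerate(groups):
        if i < n_train:
            assignment[group] = "train"
        elif i < n_train + n_dev:
            assignment[group] = "dev"
        else:
            assignment[group] = "test"

    splits: Dict[str, List[dict]] = {"train": [], "dev": [], "test": []}
    for s in samples:
        splits[assignment[s[key]]].append(s)
    return splits


# Toy data: rule r has r + 1 samples, so group sizes are uneven.
toy_samples = [{"rule_id": r, "idx": i} for r in range(10) for i in range(r + 1)]

for seed in (0, 1):
    sizes = {name: len(part) for name, part in split_by_group(toy_samples, seed=seed).items()}
    print(seed, sizes)
```

Because groups have different sizes, two seeds can give the same number of rules per split but different numbers of samples, which would match the small discrepancy I see.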