Hello, I downloaded the data.tar.gz and training-data.tar.gz archives you provided on GitHub. When I ran them, I found that the number of proteins in the dataset is inconsistent with the numbers given in supplementary file D3 of the paper. For example, for mf, the training, validation, and test sets contain 38533, 1901, and 2845 entries, respectively, whereas supplementary file D3 reports 57072, 2964, and 4221. Why do these counts differ?