yangzhao1230 closed this issue 9 months ago
For how to use MMseqs2 to cluster at different identity thresholds, please refer to Supplementary Text 1, "ML model development and evaluation". The cross-validation for each split is now updated. Let us know if you have further questions or need us to upload the exact script used for splitting.
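For readers who want a starting point, here is a minimal sketch of the clustering step. It is not the authors' splitting script. It assumes the standard `mmseqs easy-cluster` workflow with the `--min-seq-id` option, and that the resulting `<prefix>_cluster.tsv` file holds two tab-separated columns (representative ID, member ID). The function names, file names, and the 0.1 threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def easy_cluster_command(fasta, out_prefix, tmp_dir, min_seq_id):
    """Build (but do not run) an `mmseqs easy-cluster` command line.

    min_seq_id is a fraction, e.g. 0.1 for 10% identity (illustrative value).
    """
    return [
        "mmseqs", "easy-cluster", fasta, out_prefix, tmp_dir,
        "--min-seq-id", str(min_seq_id),
    ]


def read_clusters(tsv_lines):
    """Group member IDs by cluster representative.

    Assumes each non-empty line is `representative<TAB>member`.
    """
    clusters = defaultdict(list)
    for line in tsv_lines:
        line = line.strip()
        if not line:
            continue
        rep, member = line.split("\t")
        clusters[rep].append(member)
    return dict(clusters)


def representatives(clusters):
    """Return the sorted representative IDs, one per cluster."""
    return sorted(clusters)


# Hypothetical usage (requires MMseqs2 to be installed and on PATH):
#   import subprocess
#   subprocess.run(easy_cluster_command("train.fasta", "out", "tmp", 0.1), check=True)
#   with open("out_cluster.tsv") as handle:
#       reps = representatives(read_clusters(handle))
```

Keeping one representative per cluster is only one way to turn clusters into a split; the procedure actually used for the released CSV files is the one described in Supplementary Text 1.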
Hi,
Thanks for your great work and nice code.
I'm interested in your data splits, e.g. 'split10.csv' and 'split100.csv'. Neither your paper nor your code gives many details about how the split data was obtained. I guess you preprocessed it by comparing data from SwissProt with data from your two test sets.
I'd appreciate it if you could give more details about the data split, either as a description or as code.