a-r-j / graphein

Protein Graph Library
https://graphein.ai/
MIT License
1.03k stars, 131 forks

Structure-informed dataset splitting #11

a-r-j closed 1 year ago

a-r-j commented 4 years ago

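Create robust train/val/test splits based on SCOP/CATH structural classifications. Sequence-based approaches (e.g. sequence identity thresholding or BLAST-based filtering) are bad practice and should not be encouraged: proteins with low sequence identity can still share a fold, so structurally similar proteins can end up on both sides of a split.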
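
A rough sketch of what such a split could look like, to make the idea concrete. This is not graphein's API: the function name `split_by_classification`, the placeholder protein IDs, and the greedy assignment strategy are all assumptions for illustration. It only assumes that each protein already has a classification label (for example a CATH superfamily ID or a SCOP fold) and keeps every label entirely inside one split.

```python
import random
from collections import defaultdict


def split_by_classification(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Hypothetical helper: split protein IDs so no classification label spans two splits.

    labels: dict mapping protein ID -> classification label (e.g. CATH superfamily).
    fractions: target share of proteins for train, val, test.
    """
    # Group protein IDs by their classification label.
    groups = defaultdict(list)
    for protein_id, label in labels.items():
        groups[label].append(protein_id)

    # Shuffle the labels deterministically so the split is reproducible.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    names = ["train", "val", "test"]
    targets = [fraction * len(labels) for fraction in fractions]
    splits = {name: [] for name in names}

    for key in keys:
        # Greedy choice: give the whole group to the split furthest below its target size.
        deficits = [targets[i] - len(splits[names[i]]) for i in range(len(names))]
        chosen = max(range(len(names)), key=lambda i: deficits[i])
        splits[names[chosen]].extend(groups[key])

    return splits


# Placeholder protein IDs, used only to exercise the function.
example_labels = {
    "protA": "3.40.50.300",
    "protB": "3.40.50.300",
    "protC": "1.10.10.10",
    "protD": "1.10.10.10",
    "protE": "2.60.40.10",
    "protF": "2.60.40.10",
}
splits = split_by_classification(example_labels)
```

Because whole groups are assigned at once, the resulting sizes only approximate the requested fractions, and with very few groups a split can end up empty. Splitting at a coarser level of the hierarchy (e.g. topology instead of superfamily) gives a stricter separation at the cost of fewer groups to distribute.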