Updated the code to follow our revised JCIM paper. In particular, we moved away from the UniProt-based splitting strategy used in our bioRxiv paper to a sequence-based clustering approach, whereby protein structures sharing more than 30% sequence identity are always allocated to the same set (training or testing). We have also made data pre-processing more robust and pinned the versions of several dependencies.
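To illustrate the idea of a cluster-based split, here is a minimal sketch. It is not the code shipped in this repository. It assumes the clusters at the 30% sequence identity threshold have already been computed by an external tool and are available as a mapping from structure ID to cluster ID. The function name `cluster_split`, the `test_fraction` parameter, and the example IDs are hypothetical.

```python
import random
from collections import defaultdict


def cluster_split(cluster_of, test_fraction=0.2, seed=0):
    """Split structure IDs so that no cluster spans both sets.

    cluster_of: dict mapping structure ID -> cluster ID (assumed to come
    from a prior clustering at 30% sequence identity).
    """
    # Group structures by their cluster.
    clusters = defaultdict(list)
    for structure_id, cluster_id in cluster_of.items():
        clusters[cluster_id].append(structure_id)

    # Shuffle whole clusters reproducibly, not individual structures.
    cluster_ids = sorted(clusters)
    random.Random(seed).shuffle(cluster_ids)

    # Fill the test set cluster by cluster until it reaches the target
    # size; every remaining cluster goes to the training set.
    target = test_fraction * len(cluster_of)
    train, test = [], []
    for cluster_id in cluster_ids:
        bucket = test if len(test) < target else train
        bucket.extend(clusters[cluster_id])
    return train, test


# Hypothetical example: six structures in four clusters.
cluster_of = {"s1": 0, "s2": 0, "s3": 1, "s4": 2, "s5": 2, "s6": 3}
train, test = cluster_split(cluster_of, test_fraction=0.3)
```

Because whole clusters are assigned at once, the test set size only approximates `test_fraction`, but no structure in the test set shares a cluster with any training structure.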