tyang816 / ProtSSN

Fusion of protein sequence and structural information, using a denoising pre-training network for protein engineering (zero-shot).
MIT License

Inputs during training and inference #5

Open zchwang opened 4 days ago

zchwang commented 4 days ago

Hi,

Excellent work! I am reading ProtSSN and trying to use it, but I have a few questions:

1. The input in downstream tasks is the entire protein, while the training set uses CATH domains. How does this difference affect the model's performance?
2. The model's inputs during training are crystal structures, but AF2 or ESM2 predicted structures are used for inference. How much bias does this introduce?
3. If I want to use ProtSSN for downstream tasks, do I just use the code provided in the README to extract embeddings?

Congratulations again on your work!

Best regards

tyang816 commented 3 days ago

Hi, Wang,

  1. Good question! From a biological perspective, the CATH domains already cover a sufficient range of protein structural paradigms, but from a computational perspective this is indeed a gap, and we will run additional experiments on it in the future.
  2. We don't know how much error this introduces in silico, because crystal structures are unavailable for most proteins. However, we are currently running wet-lab validation, and so far both predicted and crystal structures seem to work well. We may be able to answer this question more precisely in our future work.
  3. I have added new code for fine-tuning ProtSSN on any downstream task; you can see it here. You provide a CSV with labels plus the corresponding PDB files for training; the expected dataset format can be found here (an illustrative sketch of assembling such a dataset follows this list).
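For illustration only, a minimal sketch of preparing such a labeled training set is shown below. The `name`/`label` column names, file paths, and directory layout are assumptions for the example, not the repo's confirmed schema; the dataset format linked above is authoritative.

```python
# Illustrative only: assemble a labels CSV for a directory of PDB structures.
# The "name"/"label" column names and paths are assumptions for this sketch;
# follow the dataset format linked above for the exact schema ProtSSN expects.
import csv
from pathlib import Path

pdb_dir = Path("data/pdbs")              # one PDB file per protein
labels = {
    "protein_A": 0.73,                   # e.g. a fitness or stability score
    "protein_B": 1.12,
}

rows = []
for name, label in labels.items():
    pdb_file = pdb_dir / f"{name}.pdb"
    if not pdb_file.exists():            # every labeled entry needs a structure
        raise FileNotFoundError(f"Missing structure: {pdb_file}")
    rows.append({"name": name, "label": label})

with open("data/train.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "label"])
    writer.writeheader()
    writer.writerows(rows)
```

Under the same naming assumption, the fine-tuning script would then be pointed at `data/train.csv` and `data/pdbs/`.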

Thank you for your interest, and feel free to check out our latest work, ProSST. 😊