ievapudz / TemStaPro

TemStaPro - a program for protein thermostability prediction using sequence representations from a protein language model.
MIT License

More accurate Tm value #4

Closed: xing-he529 closed this issue 9 months ago

xing-he529 commented 1 year ago

Hi, it's a very nice tool! Thanks for your work, which helps me a lot. I'm working on protein evolution in very closely related species; however, the temperature thresholds (40|45|50|55|60|65) cannot tell me the difference between them. Could you provide a more precise Tm value (e.g. a mean value)?

Best wishes, Xing HE

ievapudz commented 1 year ago

Hello, thanks for the positive feedback. Unfortunately, the program's classifiers are trained on organism growth temperature data. Such data was abundant, so it could be used to train the classifiers effectively. There is far less protein Tm data available, so classifiers could not be trained as well on it.
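To illustrate why a set of threshold classifiers cannot separate closely related proteins, here is a minimal sketch of how per-threshold binary calls collapse into a temperature interval. The function name `interval_from_binary_calls` and the "True means at or above the threshold" convention are assumptions made for this example; this is not TemStaPro's actual API or output format.

```python
# Hypothetical helper for illustration only; not part of TemStaPro.
THRESHOLDS = [40, 45, 50, 55, 60, 65]  # degrees Celsius, as listed in this thread


def interval_from_binary_calls(calls):
    """Turn one binary call per threshold into a (lower, upper) interval.

    calls[i] is True when the classifier for THRESHOLDS[i] says
    "at or above this temperature" (assumed convention).
    None means the interval is unbounded on that side.
    Scanning stops at the first False, so any later calls are ignored.
    """
    lower, upper = None, None
    for threshold, call in zip(THRESHOLDS, calls):
        if call:
            lower = threshold
        else:
            upper = threshold
            break
    return lower, upper


# Two proteins whose true values differ by a few degrees can receive
# identical calls and therefore land in the same interval.
protein_a = interval_from_binary_calls([True, True, False, False, False, False])
protein_b = interval_from_binary_calls([True, True, False, False, False, False])
print(protein_a, protein_b)
```

With 5-degree steps between thresholds, any two proteins that fall between the same pair of thresholds are indistinguishable, which matches the limitation described in the original question.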

xing-he529 commented 1 month ago

Hello, I now have some experimental protein Tm data. Is it possible for me to train the model on it?

ievapudz commented 4 weeks ago

Hello, if I understand correctly, you want to train a model to predict Tm values from protein sequence data? For that, you should train a regressor, whereas TemStaPro uses a collection of binary classifiers. You can use this work (https://academic.oup.com/bioinformatics/article/40/4/btae157/7632735) as a guideline and adapt the model to the regression task, or you can come up with a completely different model design.
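As a rough illustration of the regression setup suggested above, the sketch below fits a linear regressor with a mean squared error loss instead of a binary classification loss. It assumes each protein has already been reduced to a fixed-length numeric vector (in practice, an embedding from a protein language model). The names `fit_linear_regressor` and `predict` are hypothetical, the two-dimensional vectors and target values are toy placeholders rather than measurements, and a real model would likely be a neural network rather than this plain linear fit.

```python
def predict(weights, bias, features):
    """Linear prediction: dot(weights, features) + bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def fit_linear_regressor(X, y, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for features, target in zip(X, y):
            error = predict(weights, bias, features) - target
            for j in range(d):
                grad_w[j] += 2.0 * error * features[j] / n
            grad_b += 2.0 * error / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Toy placeholder data: 2-dimensional stand-ins for embeddings and
# made-up continuous targets standing in for Tm values in degrees Celsius.
X_toy = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_toy = [50.0, 60.0, 45.0, 55.0]

w, b = fit_linear_regressor(X_toy, y_toy)
print(predict(w, b, [1.0, 0.0]))  # a continuous value, not a class label
```

The key difference from a binary classifier is the target and the loss: the model is trained on continuous Tm values and outputs a single number, so two similar proteins can receive different predictions.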