agemagician / ProtTrans

ProtTrans provides state-of-the-art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit and hundreds of Google TPUs using Transformer models.
Academic Free License v3.0

About the example: ProtT5-XL-UniRef50.ipynb #59

Closed Lyn-666 closed 2 years ago

Lyn-666 commented 2 years ago

Hi,

I'm trying to extract protein sequence features using the ProtT5-XL-UniRef50 model. However, when I reran 'ProtT5-XL-UniRef50.ipynb', the features I got were different from the features shown in 'ProtT5-XL-UniRef50.ipynb'. I then reran 'ProtT5-XL-UniRef50.ipynb' with TensorFlow and with PyTorch separately; those two sets of features match each other, but both differ from the features in the example. Can you help me find the reason? Thanks!
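For reference, a minimal PyTorch sketch of the embedding-extraction step along the lines of the notebook (the Rostlab/prot_t5_xl_uniref50 checkpoint name is the one published on the HuggingFace Hub; the toy sequences are placeholders, not data from this thread):

```python
import re
import torch
from transformers import T5EncoderModel, T5Tokenizer

# Load the ProtT5-XL-UniRef50 encoder and tokenizer from the HuggingFace Hub
tokenizer = T5Tokenizer.from_pretrained("Rostlab/prot_t5_xl_uniref50", do_lower_case=False)
model = T5EncoderModel.from_pretrained("Rostlab/prot_t5_xl_uniref50")
model.eval()

# Placeholder sequences; ProtT5 expects space-separated residues
# with rare amino acids (U, Z, O, B) mapped to X
sequences = ["PRTEINO", "SEQWENCE"]
sequences = [" ".join(re.sub(r"[UZOB]", "X", s)) for s in sequences]

ids = tokenizer.batch_encode_plus(sequences, add_special_tokens=True, padding="longest")
input_ids = torch.tensor(ids["input_ids"])
attention_mask = torch.tensor(ids["attention_mask"])

with torch.no_grad():
    # last_hidden_state: (batch, max_len, 1024) per-residue embeddings
    embeddings = model(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state

# Strip padding and the trailing </s> token to get per-protein features
for i in range(len(sequences)):
    seq_len = int(attention_mask[i].sum())
    per_residue = embeddings[i, : seq_len - 1]   # (L, 1024)
    per_protein = per_residue.mean(dim=0)        # (1024,) mean-pooled
```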

agemagician commented 2 years ago

Hello,

We have not changed the model weights since we released them. However, the HuggingFace team sometimes fixes bugs in their models' code, which means the output of these models can differ between HuggingFace releases. The same goes for the other libraries that we used in our examples.
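One way to check whether two runs differ beyond numerical noise is to log the library version alongside the saved embeddings and compare them directly; a minimal sketch (the embedding file names here are hypothetical):

```python
import torch
import transformers

# Record the library version with any saved embeddings, since bug fixes
# in newer releases can shift model outputs
print("transformers version:", transformers.__version__)

# Hypothetical files holding embeddings from two different runs
new_emb = torch.load("embeddings_new.pt")
ref_emb = torch.load("embeddings_from_notebook.pt")

print("max abs difference:", (new_emb - ref_emb).abs().max().item())
print("equal within tolerance:", torch.allclose(new_emb, ref_emb, atol=1e-4))
```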

Please test the model on your downstream task, and if you find that it performs worse than ProtBert-BFD, please open a new issue.