agemagician / ProtTrans

ProtTrans provides state-of-the-art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit and hundreds of Google TPUs using Transformer models.
Academic Free License v3.0

Protein-protein interaction #97

Closed: Moeinh77 closed this issue 1 year ago

Moeinh77 commented 1 year ago

How can I perform protein-protein interaction (PPI) prediction using the models? Are there any examples in the repo that I couldn't find?

mheinzinger commented 1 year ago

Nope, you did not miss it. We have no examples on this yet, sorry. There are some lines of work that might be useful as inspiration, though, e.g., D-SCRIPT or Topsy-Turvy. I think that when you approach this topic, the most important part is actually not the machine learning but the dataset preparation, so maybe take a look at this: More challenges for ML in PPI prediction

Moeinh77 commented 1 year ago

Thank you very much for sharing those links. My idea was to use the last hidden state of the network as features for each sequence, combine the features of the two sequences (e.g., by concatenating them or applying some other operation to the two vectors), and then train a binary classifier on the result for PPI prediction.
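A minimal sketch of this idea, in plain Python so it runs without any model download. It assumes the per-residue embeddings (one vector per amino acid, taken from the last hidden state) have already been extracted and stripped of padding and special-token positions. The names `mean_pool`, `pair_features`, `train_logistic`, and `fake_embedding` are hypothetical helpers for illustration, not part of ProtTrans, and the random embeddings and labels are placeholders with no biological meaning.

```python
import math
import random


def mean_pool(residue_embeddings):
    """Average per-residue vectors (L x D) into one per-protein vector (D)."""
    length = len(residue_embeddings)
    dim = len(residue_embeddings[0])
    return [sum(row[j] for row in residue_embeddings) / length for j in range(dim)]


def pair_features(u, v):
    """Order-invariant pair features: |u - v| and u * v, concatenated."""
    return [abs(a - b) for a, b in zip(u, v)] + [a * b for a, b in zip(u, v)]


def predict_proba(weights, bias, features):
    """Logistic regression probability, written to avoid overflow in exp()."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit a binary classifier with plain stochastic gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def fake_embedding(length, dim=8):
    """Placeholder for a real (L x D) last-hidden-state matrix."""
    return [[random.gauss(0, 1) for _ in range(dim)] for _ in range(length)]


random.seed(0)
# Toy protein pairs of different lengths with arbitrary 0/1 labels.
pairs = [(fake_embedding(30), fake_embedding(45)) for _ in range(6)]
labels = [1, 0, 1, 0, 1, 0]

X = [pair_features(mean_pool(a), mean_pool(b)) for a, b in pairs]
weights, bias = train_logistic(X, labels)
print(predict_proba(weights, bias, X[0]))  # a probability in (0, 1)
```

Using `|u - v|` and `u * v` rather than plain concatenation makes the score independent of the order in which the two proteins are given, which seems desirable for PPI but is a design choice rather than a requirement. In practice the pooling and the classifier would more likely be written with tensors and a small neural network, and the train/test split matters more than the classifier itself, as noted above.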

mheinzinger commented 1 year ago

Perfectly valid approach and definitely very interesting! If you get somewhere with this approach, it would be great if you could share your insights here :)