churchlab / UniRep

UniRep model, usage, and examples.

Using model to predict #26

Open kvetab opened 2 years ago

kvetab commented 2 years ago

Hi, I'm sorry if this is a silly question, but I'm not too familiar with TensorFlow 1. After I train the model as in the last or second-to-last cell of the tutorial notebook, how do I use it to make predictions on other proteins? I'm using this for binary classification of antibody sequences, but I can't figure out how to use the model once it's trained. Thanks a lot!

pykao commented 2 years ago

I think the main purpose of this paper is to learn the protein embedding in an unsupervised manner.

I suggest using babbler1900 as an embedding extractor that maps each antibody sequence to a 1900-dimensional feature vector. Then train a classifier such as a random forest, XGBoost, or an MLP on those vectors for your classification task.
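A minimal sketch of that two-step approach, in case it helps. The `embed` function here is a hypothetical stand-in (a simple amino-acid composition padded to 1900 dimensions) so the example runs without the UniRep weights; in practice you would replace its body with the representation from babbler1900, which I believe the tutorial notebook obtains via `b.get_rep(seq)`. The sequences and labels are made-up toy data, and scikit-learn's `RandomForestClassifier` is just one possible choice of classifier.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

EMBED_DIM = 1900  # size of the babbler1900 representation
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed(seq):
    """Hypothetical stand-in for the UniRep embedding.

    Returns a 1900-dimensional vector whose first 20 entries hold the
    amino-acid composition of `seq`. With a loaded babbler1900 model `b`,
    you would instead return something like the averaged hidden state
    from `b.get_rep(seq)` (an assumption about your setup).
    """
    vec = np.zeros(EMBED_DIM)
    for aa in seq:
        vec[AMINO_ACIDS.index(aa)] += 1.0 / len(seq)
    return vec


# Toy, made-up training data: sequences with binary labels.
train_seqs = ["ACDEFGHIK", "ACDEFGHIKL", "AACDEFGH",
              "WYVTSRQPN", "WYVTSRQPNM", "WWYVTSRQ"]
train_labels = [1, 1, 1, 0, 0, 0]

# Step 1: map every sequence to a fixed-length feature vector.
X_train = np.stack([embed(s) for s in train_seqs])

# Step 2: fit an ordinary classifier on the embeddings.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, train_labels)

# Prediction on new proteins: embed them the same way, then call predict.
new_seqs = ["ACDEFGHI", "YVTSRQPN"]
X_new = np.stack([embed(s) for s in new_seqs])
preds = clf.predict(X_new)
print(preds)
```

The point of this design is that the trained classifier never touches TensorFlow: once the embeddings are computed, prediction on a new protein is just `embed` followed by `clf.predict`.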

norakearns commented 2 years ago

Hi, I have a question about shuffling the sequences while keeping each value associated with its sequence. In the UniRep Jupyter notebook tutorial you mention: "You can get around this by just prepending every integer sequence with the sequence label (e.g., every sequence would be saved to the file as '{brightness value}, 24, 1, 5,...') and then you could just index out the first column after calling the bucket_op." I am not sure how to implement this and was wondering if you might provide some guidance. Thank you!
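One way the quoted suggestion could be read, as a rough sketch rather than the authors' implementation: write the label as the first integer on each line, let the batching shuffle whole lines, then split column 0 off each batch. The helper names `format_with_label` and `split_batch` are hypothetical, the integer sequences and labels are made-up, and the zero-padded array below only imitates what evaluating the `bucket_op` would return (assuming it yields a 2-D integer array padded with zeros).

```python
import numpy as np


def format_with_label(label, int_seq):
    """Return one comma-separated line with the label in the first column.

    The label is cast to int on the assumption that the file is parsed as
    integers; a continuous value such as brightness would need to be
    scaled or binned to an integer first.
    """
    return ",".join(str(v) for v in [int(label)] + list(int_seq))


def split_batch(batch):
    """Split a padded batch into (labels, sequences) by column."""
    batch = np.asarray(batch)
    return batch[:, 0], batch[:, 1:]


# Toy (label, integer-encoded sequence) pairs; values are made-up.
records = [(3, [24, 1, 5, 9]), (7, [24, 4, 6])]
lines = [format_with_label(lbl, seq) for lbl, seq in records]

# Stand-in for a batch produced by the bucket op: rows zero-padded to
# the longest line. Row order may be shuffled, but each label stays
# attached to its own row.
max_len = max(len(line.split(",")) for line in lines)
batch = np.zeros((len(lines), max_len), dtype=np.int64)
for i, line in enumerate(lines):
    vals = [int(v) for v in line.split(",")]
    batch[i, :len(vals)] = vals

labels, seqs = split_batch(batch)
print(labels)
print(seqs)
```

Note that every line becomes one element longer, so if batches are bucketed by length, the bucket boundaries effectively shift by one.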