nadavbra / protein_bert


Improving accuracy while fine-tuning proteinBERT #72

Closed Chinjuj2017 closed 8 months ago

Chinjuj2017 commented 9 months ago

Hi, can anyone suggest ways to improve the accuracy of ProteinBERT while fine-tuning? Similar to the adapter methods used in PyTorch-based models, is there any such concept for TensorFlow LLMs?

ddofer commented 9 months ago

Hi, you can look at the example notebook (used for fine-tuning on the benchmarks): https://github.com/nadavbra/protein_bert/blob/master/ProteinBERT%20demo.ipynb

Optimal accuracy (at higher complexity) comes from fine-tuning the entire model on a task with a task-specific head (e.g. classification or regression), or from freezing most layers and training only the extra head.
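The freeze-most-layers variant can be sketched in Keras as below. Note this is a minimal illustration, not the ProteinBERT API: `pretrained_backbone` is a hypothetical stand-in for the loaded pretrained model (see the demo notebook for the actual loading and fine-tuning utilities).

```python
import tensorflow as tf

# Hypothetical stand-in for a pretrained backbone; with ProteinBERT you would
# load the real pretrained model via the library (see the demo notebook).
pretrained_backbone = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(128,)),
    tf.keras.layers.Dense(64, activation="relu"),
])

# Freeze the pretrained layers so their weights are not updated.
for layer in pretrained_backbone.layers:
    layer.trainable = False

# Attach a task-specific head (here: binary classification) and train only it.
model = tf.keras.Sequential([
    pretrained_backbone,
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Only the head's kernel and bias remain trainable.
print(len(model.trainable_weights))
```

Calling `model.fit(...)` on your task data then updates only the head; unfreezing a few top backbone layers afterwards (with a low learning rate) is a common middle ground.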

A much easier approach is to extract embeddings and train a separate model on them, e.g. take the global-layer embeddings and fit an XGBoost or scikit-learn random forest model on top.
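That embeddings-as-features approach can be sketched as follows. The random vectors below are a placeholder for ProteinBERT's global-layer embeddings of each sequence (which you would compute once with the pretrained model); the dimensions and labels are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_proteins, embed_dim = 200, 512  # hypothetical sizes

# Stand-ins for precomputed global-layer embeddings and binary task labels.
X = rng.normal(size=(n_proteins, embed_dim))
y = rng.integers(0, 2, size=n_proteins)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train a classical model on the frozen embeddings; no backprop through
# the language model is needed, so this is fast and simple to tune.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
preds = clf.predict(X_test)
```

Since the embeddings are fixed, you can cheaply cross-validate or swap in XGBoost without ever touching the pretrained network.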