Pro-Prime

Updated on 2024.07.24

Introduction

This repository provides the official implementation of Prime (Protein language model for Intelligent Masked pretraining and Environment (temperature) prediction).

Key feature:

Links

Details

What is Pro-Prime?

Pro-Prime is a protein language model that predicts the Optimal Growth Temperature (OGT) and supports zero-shot prediction of protein thermostability and activity. It is built on temperature-guided language modeling.
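Zero-shot mutant scoring with a masked protein language model is commonly done by comparing the log-probability of the mutant residue with that of the wild-type residue at each mutated position. The sketch below illustrates that general idea only; it is not Pro-Prime's own API. The function names, the `A2V` mutation format with `:` as separator, and the toy logits standing in for model output are all assumptions made for illustration.

```python
import math

# Assumption: the model emits one logit per standard amino acid, in this order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def parse_mutation(mutation):
    """Split a substitution such as 'K2V' into (wild type, 0-based index, mutant)."""
    return mutation[0], int(mutation[1:-1]) - 1, mutation[-1]


def score_mutant(sequence, mutations, position_logits):
    """Sum log p(mutant) - log p(wild type) over all substitutions.

    position_logits[i] holds the 20 logits predicted for position i.
    Multiple substitutions are separated by ':' (hypothetical convention).
    """
    score = 0.0
    for mutation in mutations.split(":"):
        wt, pos, mt = parse_mutation(mutation)
        if sequence[pos] != wt:
            raise ValueError(f"{mutation}: sequence has {sequence[pos]} at that position")
        log_probs = log_softmax(position_logits[pos])
        score += log_probs[AMINO_ACIDS.index(mt)] - log_probs[AMINO_ACIDS.index(wt)]
    return score


# Toy example: uniform logits, except that V is favoured at position 2.
sequence = "MKV"
toy_logits = [[0.0] * len(AMINO_ACIDS) for _ in sequence]
toy_logits[1][AMINO_ACIDS.index("V")] = 2.0
print(f"{score_mutant(sequence, 'K2V', toy_logits):.2f}")  # prints 2.00
```

A positive score means the model assigns the mutant residue a higher probability than the wild type. In practice the logits would come from a forward pass of the language model rather than from a hand-built list.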

Use of PRIME

Main Requirements

biopython==1.81
torch==2.0.1

Installation

pip install -r requirements.txt
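
After installing, it can be useful to confirm that the environment matches the pinned versions listed under Main Requirements. The snippet below is a minimal sketch using only the standard library; the helper name `check_pins` is hypothetical and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the Main Requirements section.
PINNED = {"biopython": "1.81", "torch": "2.0.1"}


def check_pins(pins, get_version=version):
    """Return {package: installed version or None} for every pin that is not met."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        # Some torch wheels report a local tag such as "2.0.1+cu118"; ignore it.
        if found is None or found.split("+")[0] != wanted:
            problems[name] = found
    return problems


if __name__ == "__main__":
    for name, found in check_pins(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {found or 'not installed'}")
```

An empty result means both pins are satisfied.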

ProteinGym scores can be downloaded from:

https://drive.google.com/file/d/1AEpK3TmgFNszZXJQWwRPkHUugrdHrTgk/view?usp=sharing

🚀 Run Notebooks

Supervised fine-tuning for mutant fitness learning

šŸ™‹ā€ā™€ļø Feedback and Contact

šŸ›”ļø License

This project is under the MIT license. See LICENSE for details.

šŸ™ Acknowledgement

Much of the code is adapted from 🤗 transformers and esm.

šŸ“ Citation

If you find this repository useful, please consider citing this paper:

@misc{tan2023,
      title={Engineering Enhanced Stability and Activity in Proteins through a Novel Temperature-Guided Language Modeling},
      author={Pan Tan and Mingchen Li and Liang Zhang and Zhiqiang Hu and Liang Hong},
      year={2023},
      eprint={2304.03780},
      archivePrefix={arXiv},
      primaryClass={q-bio.QM}
}