rs239 / ablm

Protein language model customized for antibodies
MIT License

AbMAP: Antibody Mutagenesis-Augmented Processing

This repository is a work in progress.

This repository contains code and pre-trained model checkpoints for AbMAP, a protein language model (PLM) customized for antibodies, as described in Learning the Language of Antibody Hypervariability (Singh, Im et al. 2023). AbMAP combines information from foundational PLMs with antibody structure and function, making it a multi-purpose tool for predicting antibody structure and functional properties and for analyzing B-cell repertoires.

Installation

AbMAP relies on ANARCI to assign IMGT labels to antibody sequences. Please see the ANARCI repo or run the following in a new conda environment:

conda install -c conda-forge biopython -y
conda install -c bioconda hmmer=3.3.2 -y
git clone https://github.com/oxpig/ANARCI.git
cd ANARCI
python setup.py install

Then install abmap using one of the following (the two commands are alternatives, not sequential steps):

pip install abmap  # (recommended) latest release from PyPI 
pip install git+https://github.com/rs239/ablm.git  # the live main branch
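
To confirm that the steps above left a working environment, a quick check along the following lines may help. This is only a sketch and is not part of AbMAP. It assumes the import names `Bio` (Biopython), `anarci` (ANARCI) and `abmap`, and it assumes that ANARCI needs HMMER's `hmmscan` executable on the PATH. The helper `check_environment` is a hypothetical name introduced here for illustration.

```python
import importlib.util
import shutil

# Assumed import names: Biopython -> "Bio", ANARCI -> "anarci", AbMAP -> "abmap".
REQUIRED_MODULES = ("Bio", "anarci", "abmap")
# Assumption: ANARCI shells out to HMMER's hmmscan, so it should be on PATH.
REQUIRED_EXECUTABLES = ("hmmscan",)


def check_environment(modules=REQUIRED_MODULES, executables=REQUIRED_EXECUTABLES):
    """Return a dict mapping each requirement to True if it was found."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        status[f"module:{name}"] = importlib.util.find_spec(name) is not None
    for name in executables:
        # shutil.which returns None when the executable is not on PATH.
        status[f"executable:{name}"] = shutil.which(name) is not None
    return status


if __name__ == "__main__":
    for requirement, found in check_environment().items():
        label = "ok" if found else "MISSING"
        print(f"{label:8s}{requirement}")
```

Any line reported as MISSING points to the installation step worth revisiting. The check only locates modules and executables; it does not import them or run any model code.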

Usage

After installation, AbMAP can be imported into your Python projects or run from the command line. Please see examples/demo.ipynb for common use cases.

Reference

Please provide feedback on the issues page or by opening a pull request. If AbMAP is useful in your work, please consider citing our bioRxiv preprint.