News
The main branch works with PyTorch 1.8 (required by some self-supervised methods) or higher (we recommend PyTorch 1.12). You can still use PyTorch 1.6 for most methods.
OpenBioSeq is an open-source supervised and self-supervised bio-sequence representation learning toolbox based on PyTorch. OpenBioSeq supports popular backbones, pre-training methods, and various features.
Learning useful bio-sequence representations efficiently facilitates various downstream tasks in the biological and chemical fields. This repo focuses on supervised and self-supervised bio-sequence representation learning and is named OpenBioSeq.
This repo will continue to be updated in 2022. Please watch the repo for the latest updates!
Please refer to CHANGELOG.md for details and release history.
[2022-06-09] OpenBioSeq v0.1.1 is released.
[2022-05-24] OpenBioSeq v0.1.0 is initialized.
Quick installation steps for development:
conda create -n openbioseq python=3.8 -y
conda activate openbioseq
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113 # as an example
pip install openmim
mim install mmcv-full
git clone https://github.com/Westlake-AI/OpenBioSeq.git
cd OpenBioSeq
python setup.py develop
Please refer to INSTALL.md for detailed installation instructions and dataset preparation.
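After installation, it can help to confirm that the installed PyTorch version matches the requirements stated in this README (1.8 or higher for all methods, 1.6 for most methods). The following is a minimal sketch, not part of OpenBioSeq: the helper names `version_tuple` and `support_level` are hypothetical, and treating versions below 1.6 as "untested" is an assumption, since this README does not say how they behave.

```python
import re


def version_tuple(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '1.11.0+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def support_level(version: str) -> str:
    """Map a PyTorch version to the support level described in this README."""
    major_minor = version_tuple(version)
    if major_minor >= (1, 8):
        # 1.8+ is required by some self-supervised methods.
        return "all methods"
    if major_minor >= (1, 6):
        # 1.6 still works for most methods.
        return "most methods"
    # Assumption: versions below 1.6 are not mentioned in the README.
    return "untested"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print(torch.__version__, "->", support_level(torch.__version__))
```

Run the script inside the `openbioseq` conda environment created above. A result of "most methods" or "untested" suggests upgrading PyTorch before using the self-supervised methods.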
Please see Getting Started for the basic usage of OpenBioSeq (based on OpenMixup and MMSelfSup). As an example, you can start multi-GPU training with a given CONFIG_FILE using the following script:
bash tools/dist_train.sh ${CONFIG_FILE} ${GPUS} [optional arguments]
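When launching several runs from a script, the command above can be assembled programmatically. The sketch below only builds the argument list in the order shown above (config file, number of GPUs, then optional arguments). The helper name `dist_train_command` and the config path `configs/example_config.py` are hypothetical placeholders, not files or functions shipped with OpenBioSeq.

```python
import shlex


def dist_train_command(config_file: str, gpus: int, *extra_args: str) -> list:
    """Build the argument list for tools/dist_train.sh."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    # Order follows the usage line: CONFIG_FILE, GPUS, then optional arguments.
    return ["bash", "tools/dist_train.sh", config_file, str(gpus), *extra_args]


# Hypothetical config path, used for illustration only.
cmd = dist_train_command("configs/example_config.py", 4)
print(shlex.join(cmd))
```

To actually start training, pass the list to `subprocess.run(cmd, check=True)` from the root of the cloned repository, replacing the placeholder with a real config file.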
Then, please see the tutorials for more technical details (based on MMClassification).
This project is released under the Apache 2.0 license.
OpenBioSeq is an open-source project for supervised and self-supervised methods on bio-sequence datasets, created by researchers in CAIRI AI LAB. We encourage researchers interested in bio-sequence research and applications to contribute to OpenBioSeq!
If you find this project useful in your research, please consider citing:
@misc{2022openbioseq,
title={{OpenBioSeq}: Open Toolbox and Benchmark for Bio-sequence Representation Learning},
author={Li, Siyuan and Liu, Zicheng and Wu, Di and Li, Stan Z.},
howpublished = {\url{https://github.com/Westlake-AI/openbioseq}},
year={2022}
}
For now, the direct contributors include Siyuan Li (@Lupin1998) and Zicheng Liu (@pone7). We thank the contributors of OpenMixup, MMSelfSup, and MMClassification.
This repo is currently maintained by Siyuan Li (lisiyuan@westlake.edu.cn) and Zicheng Liu (liuzicheng@westlake.edu.cn).