AnjanaSenanayake / DeepSelectNet

DeepSelectNet is an enhanced deep learning model to perform read classification for selective sequencing.
MIT License
deep-neural-networks oxford-nanopore targeted-sequencing

DeepSelectNet

DeepSelectNet is an improved 1D ResNet-based model that classifies Oxford Nanopore raw electrical signals as target or non-target for Read-Until sequence enrichment or depletion. It aims to deliver better classification performance than existing methods for this task.

Abstract

Nanopore sequencing allows selective sequencing, the ability to programmatically reject unwanted reads in a sample. Selective sequencing has many present and future applications in genomics research; classifying species from a pool of species is one example. Existing methods for selective sequencing for species classification are still immature, and their accuracy varies widely across datasets. For the five datasets we tested, the accuracy of existing methods ranged from roughly 77% to 97% (average accuracy below 89%). Here we present DeepSelectNet, an accurate deep-learning-based method that can directly classify nanopore current signals belonging to a particular species. DeepSelectNet utilizes novel data preprocessing techniques and an improved neural network architecture for regularization.

Pre-print: https://www.biorxiv.org/content/10.1101/2022.10.24.513498v1

Installation

Prerequisites

Steps (on Linux)

1) Open a terminal in the root directory of the code repository.

2) Create a python3 virtual environment named deepselectenv.

python3 -m venv deepselectenv

3) Use the following command to activate the virtual environment created.

source deepselectenv/bin/activate

4) Install the required packages in the virtual environment.

pip install -r requirements.txt

5) [Optional] Leave the environment when it is not in use.

deactivate
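
As an optional sanity check after activating the environment in step 3, the short Python sketch below reports whether the current interpreter is running inside a virtual environment. It is not part of the repository; the function name `in_virtualenv` is a hypothetical helper introduced here, and the check relies only on the standard `sys` module.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv.

    Hypothetical helper for illustration; not part of DeepSelectNet.
    """
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Running it with `python` after `source deepselectenv/bin/activate` should report that a virtual environment is active; running it after `deactivate` should report the opposite.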

Scripts

1. Preprocessor

Preprocesses SLOW5 files into NumPy dumps so that they can be used for training.
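
To illustrate the general idea of turning raw signals into a NumPy dump, here is a minimal sketch. It is not the repository's preprocessor: it assumes the raw signals have already been read from the SLOW5 files into integer arrays, and the function names, the `trim` and `length` defaults, and the choice of z-score normalization are all assumptions made for the example.

```python
import numpy as np


def preprocess_signal(raw, trim=1500, length=3000):
    """Crop one raw signal to a fixed window and z-score normalize it.

    trim and length are illustrative defaults, not values taken from the
    repository. Returns None when the read is too short or is constant.
    """
    window = np.asarray(raw, dtype=np.float32)[trim:trim + length]
    if window.size < length:
        return None  # not enough samples for a full window
    std = window.std()
    if std == 0:
        return None  # a constant signal cannot be normalized
    return (window - window.mean()) / std


def dump_reads(raw_signals, out_path, trim=1500, length=3000):
    """Preprocess every usable read and save one (n_reads, length) .npy file."""
    windows = [preprocess_signal(r, trim, length) for r in raw_signals]
    kept = [w for w in windows if w is not None]
    data = np.stack(kept) if kept else np.empty((0, length), dtype=np.float32)
    np.save(out_path, data)
    return data.shape


if __name__ == "__main__":
    # Synthetic stand-ins for raw nanopore signals of different lengths.
    rng = np.random.default_rng(0)
    reads = [rng.integers(400, 600, size=n) for n in (6000, 5000, 2000)]
    print(dump_reads(reads, "example_dump.npy"))
```

In the demo, the 2000-sample read is skipped because it cannot fill a full window after trimming, so the saved array holds two rows.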

2. Trainer

Trains the model on a given dataset using the dumped NumPy arrays.
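
As a rough sketch of how dumped arrays might be prepared before training, the example below labels two arrays of preprocessed signals, shuffles them, and splits them into training and validation sets. The function `build_dataset`, the 80/20 split, and the trailing channel axis are assumptions for illustration and do not describe the repository's trainer.

```python
import numpy as np


def build_dataset(pos, neg, val_fraction=0.2, seed=0):
    """Label, shuffle, and split two arrays of preprocessed signals.

    pos and neg are (n_reads, length) arrays for the target and non-target
    classes. Hypothetical helper; names and defaults are illustrative.
    """
    x = np.concatenate([pos, neg]).astype(np.float32)
    y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))]).astype(np.int64)

    # Shuffle signals and labels with the same permutation.
    order = np.random.default_rng(seed).permutation(len(x))
    x, y = x[order], y[order]

    # Add a trailing channel axis: (batch, steps, channels), a layout that
    # several frameworks use for 1D convolutions.
    x = x[..., np.newaxis]

    n_val = int(len(x) * val_fraction)
    return (x[n_val:], y[n_val:]), (x[:n_val], y[:n_val])


if __name__ == "__main__":
    pos = np.zeros((6, 10))  # stand-in for np.load("pos.npy")
    neg = np.ones((4, 10))   # stand-in for np.load("neg.npy")
    (x_train, y_train), (x_val, y_val) = build_dataset(pos, neg)
    print(x_train.shape, x_val.shape)
```

Fixing the seed keeps the split reproducible between runs, which makes it easier to compare models trained on the same dumps.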

3. Inference

Predicts the class of unseen SLOW5 reads with a trained model.
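
Once a model has produced a target probability for each read, the final step is a keep-or-reject decision for enrichment or depletion. The sketch below shows one simple way to do this. The function `decide`, the 0.5 threshold, and the `"enrich"` and `"deplete"` mode names are assumptions for illustration, not the repository's interface.

```python
import numpy as np


def decide(probabilities, threshold=0.5, mode="enrich"):
    """Map per-read target probabilities to 'keep' or 'reject' decisions.

    Hypothetical helper. In "enrich" mode target reads are kept; in
    "deplete" mode target reads are rejected instead.
    """
    if mode not in ("enrich", "deplete"):
        raise ValueError("mode must be 'enrich' or 'deplete'")
    is_target = np.asarray(probabilities, dtype=float) >= threshold
    keep = is_target if mode == "enrich" else ~is_target
    return ["keep" if k else "reject" for k in keep]


if __name__ == "__main__":
    probs = [0.9, 0.2, 0.5]
    print(decide(probs))
    print(decide(probs, mode="deplete"))
```

Raising the threshold trades recall of target reads for fewer false positives, so a suitable value depends on the experiment.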

Support Scripts