This repository holds the code for our paper, which benchmarks SNP and indel variant calling accuracy across 14 diverse bacterial species using Oxford Nanopore Technologies (ONT) and Illumina sequencing.
The results are reported in the paper. Updates made after publication, such as new tools, versions, or experiments, will be reported here.
Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data. Michael B. Hall, Ryan R. Wick, Louise M. Judd, An N. T. Nguyen, Eike J. Steinig, Ouli Xie, Mark R. Davies, Torsten Seemann, Timothy P. Stinear, Lachlan J. M. Coin. bioRxiv 2024.03.15.585313; doi: 10.1101/2024.03.15.585313
@article{hall_benchmarking_2024,
title = {Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data},
url = {https://www.biorxiv.org/content/early/2024/03/16/2024.03.15.585313},
doi = {10.1101/2024.03.15.585313},
journal = {bioRxiv},
author = {Hall, Michael B. and Wick, Ryan R. and Judd, Louise M. and Nguyen, An N. T. and Steinig, Eike J. and Xie, Ouli and Davies, Mark R. and Seemann, Torsten and Stinear, Timothy P. and Coin, Lachlan J. M.},
year = {2024},
pages = {2024.03.15.585313}
}
Accessions and DOIs for all data can be found in config/accessions.csv.
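To get a quick overview of the accessions table, it can be read with Python's standard library. The following is a minimal sketch that assumes only that config/accessions.csv is a comma-separated file with a header row. It makes no assumption about the column names, which are not documented here, and the helper names `load_accessions` and `summarise` are hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def load_accessions(path="config/accessions.csv"):
    """Read the accessions table into a list of dicts, one per row."""
    with Path(path).open(newline="") as handle:
        return list(csv.DictReader(handle))


def summarise(rows):
    """Return the column names and row count without assuming any schema."""
    columns = list(rows[0].keys()) if rows else []
    return columns, len(rows)


if __name__ == "__main__":
    table = Path("config/accessions.csv")
    if table.exists():
        columns, n_rows = summarise(load_accessions(table))
        print(f"{n_rows} rows; columns: {', '.join(columns)}")
    else:
        # The path is relative, so this only works from the repository root.
        print(f"{table} not found; run this from the repository root")
```

Run from the repository root, this prints the number of rows and the header names, which is enough to see which columns hold the accessions and DOIs.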
The variant truthsets and the associated data used to create them are archived on Zenodo.
See the config docs for instructions on how to configure this pipeline for your data.
You will need the following packages to run the pipeline:
snakemake
pandas
apptainer or singularity
conda
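Before running the pipeline, it can help to confirm that these requirements are visible in the current environment. The sketch below is not part of the repository. It assumes that snakemake, conda, and the container runtime are installed as executables on `PATH`, and that pandas is importable from the same Python interpreter. The function name `check_requirements` is hypothetical.

```python
import importlib.util
import shutil

# Assumed to be command-line executables discoverable on PATH.
EXECUTABLES = ["snakemake", "conda"]
# Only one of these container runtimes is needed.
CONTAINER_RUNTIMES = ["apptainer", "singularity"]
# Assumed to be importable Python packages.
PYTHON_MODULES = ["pandas"]


def check_requirements():
    """Return a dict mapping each requirement to True if it was found."""
    found = {name: shutil.which(name) is not None for name in EXECUTABLES}
    found["apptainer or singularity"] = any(
        shutil.which(name) is not None for name in CONTAINER_RUNTIMES
    )
    for module in PYTHON_MODULES:
        found[module] = importlib.util.find_spec(module) is not None
    return found


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{'found  ' if ok else 'MISSING'}  {name}")
```

The check only reports whether each requirement is present. It does not verify versions, since no version requirements are stated here.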
A script for submitting the master Snakemake job on a Slurm cluster can be found at scripts/submit_slurm.sh.