clicumu / doepipeline

A Python package for optimizing processing pipelines using statistical design of experiments (DoE).
MIT License
bioinformatics doe optimization pipeline

doepipeline

Optimize your data processing pipelines with doepipeline. The optimization strategy implemented in doepipeline is based on methods from statistical Design of Experiments (DoE). Use it to optimize quantitative and/or qualitative factors of simple (single tool) or complex (multiple tool) pipelines.
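To make the DoE idea concrete, here is a minimal sketch of a full factorial design over two quantitative factors, scored with a toy response function. It does not use doepipeline's own API: `full_factorial`, `toy_pipeline`, and the factor names `kmer_size` and `min_coverage` are hypothetical names chosen for illustration, and the response function stands in for an actual pipeline run.

```python
from itertools import product


def full_factorial(levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(levels)
    combos = product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def toy_pipeline(kmer_size, min_coverage):
    """Hypothetical stand-in for one pipeline run; returns a score to maximize."""
    return -((kmer_size - 31) ** 2) - 4 * (min_coverage - 3) ** 2


# Assumed factor levels (low, center, high) for the illustration.
levels = {"kmer_size": [21, 31, 41], "min_coverage": [2, 3, 4]}

# One experiment per combination of levels: 3 x 3 = 9 runs.
design = full_factorial(levels)

# Run the stand-in pipeline for each experiment and keep the best setting.
results = [(toy_pipeline(**run), run) for run in design]
best_score, best_run = max(results, key=lambda item: item[0])

print(best_run, best_score)
```

An exhaustive grid like this grows quickly with the number of factors; statistical designs generally aim to cover the same factor space with fewer runs, which is the motivation for a DoE-based strategy.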

doepipeline overview

Features

Quick start links

Take a look at the wiki documentation to get started with doepipeline. Briefly, the following steps are needed:

  1. Install doepipeline
  2. Create YAML configuration file
  3. Run optimization

Four example cases (including data and configuration files) are provided to help you get started:

  1. De novo genome assembly
  2. Scaffolding of a fragmented genome assembly
  3. k-mer taxonomic classification of ONT MinION reads
  4. Genetic variant calling

Cite

doepipeline: a systematic approach for optimizing multi-level and multi-step data processing workflows. Svensson D, Sjögren R, Sundell D, Sjödin A, Trygg J. bioRxiv. doi: https://doi.org/10.1101/504050

About this software

doepipeline is implemented as a Python package. It is open source software made available under the MIT license.

If you experience any difficulties with this software, have suggestions, or want to contribute directly, you have the following options: