dib-lab / 2020-workflows-paper

Strategies for leveraging workflow systems to streamline large-scale biological analyses
https://dib-lab.github.io/2020-workflows-paper

Streamlining Data-Intensive Biology With Workflow Systems

Accepted Manuscript

DOI

bioRxiv preprint (initially preprinted 07/01/2020)

PDF: PDF Manuscript

HTML: HTML Manuscript

Code of Conduct

This project operates under a code of conduct. Participating in the project in any way (issues, pull requests, gitter, or other media) indicates that you agree to follow the code of conduct. We take this very seriously. If you experience harassment or notice violations of the code of conduct, please raise the issue with one of the project organizers (@taylorreiter or @bluegenes).

Project Description

As the scale of biological data generation has increased, the bottleneck of research has shifted from data generation to analysis. Researchers commonly need to build computational workflows that include multiple analytic tools and require incremental development as experimental insights demand tool and parameter modifications. Data-centric workflow systems can alleviate some of these challenges, but knowledge of and training in these techniques are still lacking. Our goal is to generate a helpful set of strategies for leveraging workflow systems to streamline large-scale biological analyses.

Our initial version has been much improved through iterations of feedback primarily from members and friends of the DIB-lab. While the practices are written with specific examples for high-throughput sequencing data, we hope many of the perspectives and guidance provided by the document apply more generally to all workflow-enabled biology.

This repository is a living document (written with Manubot) that aims to consolidate and integrate helpful information about workflow systems and their applications in data-intensive biology. We welcome constructive feedback from workflow-enabled biologists of all levels, anywhere in the world.

Contributions

To contribute, you'll need a free GitHub account.

Instructions and procedures for contributing are outlined here.

We will follow the ICMJE Guidelines for determining authorship.

Pull Requests

If you are not familiar with git and GitHub, you can use these directions to start contributing.

Feel free to ask questions by opening a Request for Help issue.

This project is a collaborative effort that will benefit from the expertise of scientists across a wide range of workflow applications!

Manubot

Manubot is a system for writing scholarly manuscripts via GitHub. Manubot automates citations and references, versions manuscripts using git, and enables collaborative writing via GitHub. An overview manuscript presents the benefits of collaborative writing with Manubot and its unique features. The rootstock repository is a general-purpose template for creating new Manubot instances, as detailed in SETUP.md. See USAGE.md for documentation on how to write a manuscript.

Please open an issue for questions related to Manubot usage, bug reports, or general inquiries.

Repository directories & files

The directories are as follows:

License

Except when noted otherwise, the entirety of this repository is licensed under a CC BY 4.0 License (LICENSE.md), which allows reuse with attribution. Please attribute by linking to https://github.com/dib-lab/2020-workflows-paper.

Since CC BY is not ideal for code and data, certain repository components are also released under the CC0 1.0 public domain dedication (LICENSE-CC0.md). All files matched by the following glob patterns are dual licensed under CC BY 4.0 and CC0 1.0:

All other files are only available under CC BY 4.0, including:

Please open an issue for any questions related to licensing.

Attribution

Many of the documents (especially *.md documents) and issues presented in this repository were modified from another Manubot repository.