biocpp / biocpp-io

BioC++ Input/Output library
https://biocpp.github.io
BSD 3-Clause "New" or "Revised" License

The BioC++ Input/Output Library

This is the I/O library of the BioC++ project. It provides the io module, which offers easy-to-use interfaces for common bioinformatics file formats such as FastA, VCF and BCF.

The primary goal of this library is to offer higher-level abstractions than the C libraries typically used in this domain (e.g. htslib) while still delivering excellent performance. It aims to provide a modern, well-integrated design that covers the most typical I/O use cases bioinformaticians encounter.

The library provides stand-alone implementations of the formats and does not require htslib. Support for reading and writing SAM/BAM/CRAM will be available in a separate library until full implementations are available here.

Please see the online documentation for more details.

Attention: this library is currently a work-in-progress, and interfaces are not yet stable.

Example

Reading a FastA file that is transparently decompressed:

bio::io::seq::reader reader{"example.fasta.gz"};

for (auto & rec : reader)
{
  fmt::print("ID:  {}\n", rec.id);
  fmt::print("Seq: {}\n", rec.seq);
}

Reading a variant file and writing a new one that only contains variants that "PASS":

bio::io::var::reader reader{"example.vcf.gz"};
bio::io::var::writer writer{"example.bcf"};

for (auto & rec : reader)
  if (rec.filter.empty() || (rec.filter.size() == 1 && rec.filter[0] == "PASS"))
    writer.push_back(rec);

The format is converted transparently from compressed VCF to BCF, provided the files have the respective extensions or magic headers.
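
The filter condition in the loop above can be factored into a named predicate, which makes the intent easier to read and to test in isolation. The sketch below uses a plain `std::vector<std::string>` as a stand-in for the record's filter field. That type, the `filter_list` alias, and the `is_pass` function are assumptions made for illustration; the actual field type in the library may differ.

```cpp
#include <string>
#include <vector>

// Stand-in for the record's filter field; the library's real type may differ.
using filter_list = std::vector<std::string>;

// A record counts as passing if it carries no filter entries at all,
// or if its only entry is the literal "PASS".
inline bool is_pass(filter_list const & filter)
{
    return filter.empty() || (filter.size() == 1 && filter[0] == "PASS");
}
```

Note that a record listing "PASS" alongside any other filter is rejected, which matches the condition used in the example loop.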

Easy to use

Dependencies

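|               | requirement | version | comment                                   |
|---------------|-------------|---------|-------------------------------------------|
| compiler      | GCC         | ≥ 11    | no other compiler is currently supported! |
| required libs | BioC++ core | = 0.7   |                                           |
| optional libs | zlib        | ≥ 1.2   | required for *.gz and *.bcf support       |
|               | bzip2       | ≥ 1.0   | required for *.bz2 file support           |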

Quick-Setup

g++ -O3 -DNDEBUG -Wall -Wextra -std=c++20           \
    -I ~/devel/biocpp-core/include                  \
    -I ~/devel/biocpp-io/include                    \
    -DBIOCPP_IO_HAS_ZLIB=1 -DBIOCPP_IO_HAS_BZIP2=1  \
    your_file.cpp
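
The two `-D...=1` flags in the command above define preprocessor macros that switch on the optional zlib and bzip2 support. The sketch below illustrates the general pattern of gating code on such 0/1 macros in a stand-alone file. How the library consumes these macros internally is not shown in this README, and `zlib_enabled`, `bzip2_enabled` and `describe_features` are hypothetical names used only for this illustration.

```cpp
#include <string>

// Default both switches to 0 when the compiler flags are absent.
#ifndef BIOCPP_IO_HAS_ZLIB
#    define BIOCPP_IO_HAS_ZLIB 0
#endif
#ifndef BIOCPP_IO_HAS_BZIP2
#    define BIOCPP_IO_HAS_BZIP2 0
#endif

// Compile-time constants derived from the macros (hypothetical names).
inline constexpr bool zlib_enabled  = BIOCPP_IO_HAS_ZLIB != 0;
inline constexpr bool bzip2_enabled = BIOCPP_IO_HAS_BZIP2 != 0;

// Hypothetical helper: summarise which optional features are switched on.
inline std::string describe_features(bool zlib, bool bzip2)
{
    std::string out = "zlib: ";
    out += zlib ? "on" : "off";
    out += ", bzip2: ";
    out += bzip2 ? "on" : "off";
    return out;
}
```

Calling `describe_features(zlib_enabled, bzip2_enabled)` in a program built with the command above would report both features as on; built without the flags, both would be reported as off.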