This is the I/O library of the BioC++ project. It provides the io module, which offers easy-to-use interfaces for common bioinformatics file formats such as FastA, VCF and BCF.
The primary goal of this library is to offer higher-level abstractions than the C libraries typically used in this domain (e.g. htslib) while still delivering excellent performance. It aims for a modern, well-integrated design that covers the typical I/O use cases bioinformaticians encounter.
The library provides stand-alone implementations of the formats and does not require htslib. Support for reading and writing SAM/BAM/CRAM will be available in a separate library until full implementations are available here.
Please see the online documentation for more details.
Attention: this library is currently a work-in-progress, and interfaces are not yet stable.
Simple reading of a FastA-file that is transparently decompressed:

    bio::io::seq::reader reader{"example.fasta.gz"};
    for (auto & rec : reader)
    {
        fmt::print("ID: {}\n", rec.id);
        fmt::print("Seq: {}\n", rec.seq);
    }
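To make the record model in the loop above concrete, here is a minimal stand-alone sketch of what "one record with an ID and a sequence" means for uncompressed FastA text. It is not the library's implementation and handles no compression. The names `fasta_record`, `parse_fasta` and `parse_fasta_string` are hypothetical, and treating the whole header line after `>` as the ID is an assumption made for this illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for illustration only; not the library's record type.
struct fasta_record
{
    std::string id;  // header line without the leading '>' (assumption)
    std::string seq; // sequence lines concatenated without line breaks
};

// Parse uncompressed FastA text: a line starting with '>' opens a new record,
// every following line up to the next '>' is appended to that record's sequence.
std::vector<fasta_record> parse_fasta(std::istream & in)
{
    std::vector<fasta_record> records;
    std::string line;
    while (std::getline(in, line))
    {
        if (!line.empty() && line.back() == '\r') // tolerate Windows line endings
            line.pop_back();
        if (line.empty())
            continue; // skip blank lines
        if (line.front() == '>')
            records.push_back(fasta_record{line.substr(1), {}});
        else if (!records.empty())
            records.back().seq += line; // sequence data before any header is ignored
    }
    return records;
}

// Convenience wrapper for in-memory text.
std::vector<fasta_record> parse_fasta_string(std::string const & text)
{
    std::istringstream in{text};
    return parse_fasta(in);
}
```

Unlike the reader shown above, this sketch loads every record into memory at once instead of iterating over the file.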
Reading a variant file and writing a new one that only contains variants that "PASS":

    bio::io::var::reader reader{"example.vcf.gz"};
    bio::io::var::writer writer{"example.bcf"};
    for (auto & rec : reader)
        if (rec.filter.empty() || (rec.filter.size() == 1 && rec.filter[0] == "PASS"))
            writer.push_back(rec);
The format is transparently converted from compressed VCF to BCF if files have the respective extensions / magic headers.
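As an illustration of what detection by magic header can look like, the following stand-alone sketch inspects the first bytes of a stream. It is not the library's detection code. The names `detected_format`, `has_prefix` and `detect_format` are hypothetical. The byte signatures come from the respective format definitions: gzip (and BGZF) streams start with `0x1f 0x8b`, bzip2 streams with `BZh`, BCF 2.x data with `BCF` followed by the major version byte `2`, and VCF text with `##fileformat=VCF`. Note that BCF files on disk are normally BGZF-compressed, so the BCF signature only becomes visible after decompression.

```cpp
#include <string_view>

// Hypothetical enum for illustration only; not part of the BioC++ API.
enum class detected_format
{
    gzip,   // also matches BGZF, which is a gzip-compatible container
    bzip2,
    bcf,    // uncompressed BCF 2.x payload
    vcf,    // plain-text VCF
    unknown
};

// True if `data` begins with the bytes in `magic`.
inline bool has_prefix(std::string_view data, std::string_view magic)
{
    return data.substr(0, magic.size()) == magic;
}

// Guess the format from the leading bytes of a file or stream.
inline detected_format detect_format(std::string_view head)
{
    using namespace std::string_view_literals;

    if (has_prefix(head, "\x1f\x8b"sv))
        return detected_format::gzip;
    if (has_prefix(head, "BZh"sv))
        return detected_format::bzip2;
    if (has_prefix(head, "BCF\2"sv))
        return detected_format::bcf;
    if (has_prefix(head, "##fileformat=VCF"sv))
        return detected_format::vcf;
    return detected_format::unknown;
}
```

A caller would typically read a few dozen bytes from the start of the file, pass them to `detect_format`, and fall back to the file extension when the result is `unknown`.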
| | requirement | version | comment |
|---|---|---|---|
| compiler | GCC | ≥ 11 | no other compiler is currently supported! |
| required libs | BioC++ core | = 0.7 | |
| optional libs | zlib | ≥ 1.2 | required for `*.gz` and `*.bcf` support |
| | bzip2 | ≥ 1.0 | required for `*.bz2` file support |
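
Assuming the sources of BioC++ core and of this library are located in `~/devel`, compile with:

    g++ -O3 -DNDEBUG -Wall -Wextra -std=c++20 \
        -I ~/devel/biocpp-core/include \
        -I ~/devel/biocpp-io/include \
        -DBIOCPP_IO_HAS_ZLIB=1 -DBIOCPP_IO_HAS_BZIP2=1 \
        your_file.cpp

To use {fmt}, as the examples above do, also pass `-I ~/devel/fmt/include -D FMT_HEADER_ONLY=1`.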