dib-lab / khmer

In-memory nucleotide sequence k-mer counting, filtering, graph traversal and more
http://khmer.readthedocs.io/

replacement C++ FAST[AQ] parser #643

Closed by mr-c 9 years ago

mr-c commented 10 years ago

To address #641, #355, #77, and #249, I feel that it is time to look for a new C++ FAST[AQ] parser.

A trial of Seqan is ongoing in PR #642.

This discussion initially took place offline. I built the mockup to check that the performance was acceptable and went from there.

Other libraries are either license-incompatible (GATB is Affero GPL), not up to the task (Heng Li's code reads only raw and zlib-compressed input, with no bzip2 support), or not designed for use as a library (SeqDB, as suggested by @luizirber).

See also: https://www.biostars.org/p/486/

Seqan already supports SAM/BAM, which will be useful for #523.

ctb commented 10 years ago

Good stuff - thanks :).

ctb commented 10 years ago

(Well, maybe it doesn't need to be closed until the seqan implementation is merged. But I concur with the reasoning and decisions.)