magicDGS / ReadTools

A Universal Toolkit for Handling Sequence Data from Different Sequencing Platforms
https://magicdgs.github.io/ReadTools/
MIT License

Support BFQ compression #72

Closed magicDGS closed 7 years ago

magicDGS commented 7 years ago

FASTQ files can be compressed in the BFQ format (see fastqutils for more information). Picard has a BamToBfqWriter; it is not general, but we can use it to understand the format better. We could implement a more general GATKReadWriter.

In addition, we could write a reader for this kind of input, so that it becomes another source of reads detected by the extension. I don't know whether the htsjdk FastqReader can read this kind of input, but it would be useful for some people. Edited: this is not going to happen, because it seems that this input is only used by MAQ and other fairly old tools.

This is related to #52, because we can use the enum in FastqConstants.FastqExtensions to decide the extension (although this does not support other compression algorithms). Because MAQ uses this format, I think it could be a good idea to implement it as an output format for the tool that converts to FASTQ all the raw reads stored in BAM (or other) formats.
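To illustrate the extension-based detection idea, here is a minimal sketch. It is only an assumption of how it could look: `ReadExtensionSketch`, its nested `Extension` enum and the `detect` method are hypothetical names, the enum is a stand-in and not the actual FastqConstants.FastqExtensions from htsjdk, and the `.bfq` suffix is added purely for illustration.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch: a stand-in for an extension enum, NOT the htsjdk
// FastqConstants.FastqExtensions class.
final class ReadExtensionSketch {

    // Illustrative list of suffixes; ".bfq" is an assumption added for this example.
    enum Extension {
        FASTQ(".fastq"),
        FASTQ_GZ(".fastq.gz"),
        FQ(".fq"),
        FQ_GZ(".fq.gz"),
        BFQ(".bfq");

        private final String suffix;

        Extension(String suffix) {
            this.suffix = suffix;
        }

        String getSuffix() {
            return suffix;
        }
    }

    // Returns the extension whose suffix ends the file name (case-insensitive),
    // or an empty Optional if no known suffix matches.
    static Optional<Extension> detect(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (Extension ext : Extension.values()) {
            if (lower.endsWith(ext.getSuffix())) {
                return Optional.of(ext);
            }
        }
        return Optional.empty();
    }
}
```

With this sketch, `ReadExtensionSketch.detect("reads.bfq")` would select the BFQ entry, while an unknown name such as `reads.bam` yields an empty result, so the caller can fall back to another reader or writer.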

magicDGS commented 7 years ago

This is not going to be done. BFQ is not used, and if it is, users should use Picard instead. Maybe that tool is a candidate for #71 once GATK/Picard is integrated.