refresh-bio / colord

A versatile compressor of third generation sequencing reads.
GNU General Public License v3.0

Nonbinary output format? #1

Open lskatz opened 3 years ago

lskatz commented 3 years ago

Hi, I don't know if this makes sense for the output format, but I was wondering whether CoLoRd could output a plaintext format. I am so used to manually inspecting files as a sanity check that a bespoke binary format might get in the way of our workflow. Also, if the compressed binary file makes its way to a computer without CoLoRd installed, there may be no way to inspect it.

I understand that this might not be possible given how the format is structured, and that plaintext output would not be as well compressed, just like SAM vs. BAM.

So anyway, is there a way to see what's under the hood? A way to pass around plaintext?

agudys commented 3 years ago

Hello,

I'm not sure I understand your needs, but unfortunately, CoLoRd archives are pure binary files in our internal format. This is different from BAM, which contains a text header followed by gzip-decompressible blocks. It could be possible to add some modes for checking archive integrity, but this would require CoLoRd to be installed.

Regards, Adam
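
To illustrate the BAM side of the comparison above, here is a minimal sketch of reading a BAM file's text header with nothing but the Python standard library. It relies on the BAM layout from the SAM/BAM specification (a gzip-compatible BGZF stream that decompresses to the magic bytes `BAM\1`, a 32-bit header length, and the header text). The function name `read_bam_header_text` is made up for this example, and nothing comparable is shown for CoLoRd archives because their format is internal to the tool.

```python
import gzip
import struct


def read_bam_header_text(path):
    """Return the plain-text SAM header stored at the start of a BAM file."""
    # BAM is stored as BGZF, a series of concatenated gzip members,
    # which Python's gzip module reads as one continuous stream.
    with gzip.open(path, "rb") as handle:
        magic = handle.read(4)
        if magic != b"BAM\x01":
            raise ValueError("not a BAM file: unexpected magic bytes")
        # l_text: little-endian 32-bit length of the header text.
        (l_text,) = struct.unpack("<I", handle.read(4))
        text = handle.read(l_text)
    # The header text may be NUL-padded, so strip trailing NUL bytes.
    return text.rstrip(b"\x00").decode("utf-8", errors="replace")
```

For example, `print(read_bam_header_text("reads.bam"))` would print the `@HD`, `@SQ`, and related header lines of a file named `reads.bam` (a placeholder name), without samtools or any other BAM-specific tool installed.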