Closed (brainstorm closed this issue 11 years ago)
I tried adding external gzip support via FIFO pipes, as described in:
http://seqanswers.com/forums/archive/index.php/t-16540.html
In commit:
https://github.com/brainstorm/DRASS/commit/18b989e4c6142c1d4217c99298eba93a4eaf1574
But apparently simple_check cannot allocate memory when the query file is a UNIX FIFO:
Checking contamination against gz fifo file...
./simple_check -m 1 -q tests/data/ecoli_dummy.fastq.gz.fifo -r tests/data/ecoli.bloom
mmap source : Cannot allocate memory
make: *** [tests] Error 1
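The failure above is consistent with how mmap() works: it maps a file-backed object of known size, whereas a FIFO is a sequential stream that cannot be mapped. A minimal sketch of a guard follows, assuming a caller wants to choose between mapping and plain sequential reads. The names `can_mmap_fd` and `drain_stream` are hypothetical helpers for illustration only and are not part of DRASS.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of DRASS): returns 1 if `fd` refers to a
 * non-empty regular file, which is the case mmap() is meant for; 0 for
 * anything else (FIFO, pipe, socket, terminal, empty file); -1 if fstat()
 * fails. */
int can_mmap_fd(int fd)
{
    struct stat st;

    assert(fd >= 0);
    if (fstat(fd, &st) != 0)
        return -1;
    return S_ISREG(st.st_mode) && st.st_size > 0;
}

/* Hypothetical fallback for streams: read sequentially in fixed-size chunks.
 * Returns the total number of bytes read, or -1 on a read error. A real
 * caller would process each chunk instead of only counting bytes. */
long long drain_stream(int fd)
{
    char buf[65536];
    long long total = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    return n < 0 ? -1 : total;
}
```

A caller could test `can_mmap_fd(fd)` right after `open()` and only take the mmap path when it returns 1, falling back to chunked reads for a FIFO such as the `.fifo` file in the test above.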
You want to use a FIFO to read the file without unzipping it?
Yes, but that was just a proof of concept. Gzip support should preferably be built into DRASS, as other bioinformatics programs do. And no, uncompressing the whole file first is not an option; reading/loading the file should be transparent.
The reason to use FIFO is that people want to automatically unzip it and then process it. Is that correct?
Yes, that is correct: unzip it as a stream. Here are two examples:
http://windrealm.org/tutorials/decompress-gzip-stream.php
http://stackoverflow.com/questions/1838699/how-can-i-decompress-a-gzip-stream-with-zlib
That one is pretty interesting too; it hijacks "open()" to transparently decompress the files:
http://www.zlibc.linux.lu/download.html
I used this (or an equivalent) several years ago. Just tested it actually, but it is a nice solution. It won't work with memory mapping though, since (IIRC) it uses a subprocess for the decompression.
L
Thanks Enze for implementing this. I've just added support for it in the wrapper and it works, although it still segfaults due to the '@' issue (that character can appear in the qualities at the beginning of a line).
Another thought I had is that we could just merge that code (big_query.c) with the regular simple_check_1_ge.c. But it works nicely as it is now.
Thanks again!
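On the '@' issue mentioned above: '@' is a valid quality character in FASTQ, so a quality line may begin with it and be mistaken for a record header by a parser that only looks at the first character of each line. One way around this is to track the position of each line within its four-line record. The sketch below assumes the common four-line FASTQ layout (no wrapped sequences); `count_fastq_records` is a hypothetical name and this is not the code in big_query.c or simple_check_1_ge.c.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sketch: count four-line FASTQ records by line position,
 * so a quality line that starts with '@' is never treated as a header.
 * Returns the record count, or -1 if the input is malformed or truncated. */
long count_fastq_records(FILE *fp)
{
    long records = 0;
    int line_in_record = 0; /* 0: header, 1: sequence, 2: '+', 3: quality */
    int at_line_start = 1;
    int c;

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF) {
        if (at_line_start) {
            /* Only the header and separator lines have a fixed first char. */
            if (line_in_record == 0 && c != '@')
                return -1;
            if (line_in_record == 2 && c != '+')
                return -1;
            at_line_start = 0;
        }
        if (c == '\n') {
            at_line_start = 1;
            if (++line_in_record == 4) {
                line_in_record = 0;
                records++;
            }
        }
    }
    /* Account for a final line that lacks a trailing newline. */
    if (!at_line_start && ++line_in_record == 4) {
        line_in_record = 0;
        records++;
    }
    return line_in_record == 0 ? records : -1;
}
```

With an input whose first record has the quality string `@III`, a naive count of lines starting with '@' would report three headers, while this position-based count reports two records.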
Feature:
Our pipeline (and many others) dumps FastQ files compressed with gzip (http://en.wikipedia.org/wiki/Gzip). Add support for querying against those in DRASS, transparently:
./simple_check -m 1 -q tests/data/ecoli_dummy.fastq.gz -r tests/data/ecoli.bloom
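For the lookup to be transparent, the tool has to decide on its own whether the query file is compressed. One possible approach, sketched here as an assumption rather than as what DRASS does, is to check the two gzip magic bytes (0x1f 0x8b) at the start of the file instead of trusting the `.gz` extension. The function name `looks_like_gzip` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sketch: returns 1 if the file at `path` starts with the
 * gzip magic bytes 0x1f 0x8b, 0 if it does not (including files shorter
 * than two bytes), and -1 if the file cannot be opened. */
int looks_like_gzip(const char *path)
{
    unsigned char magic[2];
    size_t n;
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    n = fread(magic, 1, sizeof magic, fp);
    fclose(fp);
    return n == 2 && magic[0] == 0x1f && magic[1] == 0x8b;
}
```

A caller could route files for which this returns 1 to a streaming reader (for example zlib's `gzopen()`/`gzread()`) and keep the existing path for uncompressed input. Note that this check opens the file by name, so it suits regular files; on a FIFO the two bytes would be consumed from the stream.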