ohnosequences / mg7

Configurable and scalable 16S metagenomics data analysis

https://goo.gl/y3rZFD

GNU Affero General Public License v3.0

Closed laughedelic closed 8 years ago

laughedelic commented 8 years ago

#22 doesn't solve the issue:

We want to pass FASTA files as input. If we don't have #21, then the pipeline is as follows:

split (currently splits files into chunks of 2 or 4 lines, which doesn't work with FASTA)
blast (takes each chunk, reads its first 2 lines, and makes a FASTA file out of them for BLAST)
merge
assign
count
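
To illustrate why the split step is the problem, here is a minimal sketch, assuming the input is a FASTA file whose sequences can wrap over several lines. The object name and the sample reads are made up for illustration; this is not mg7's actual split code, only the fixed-size `.grouped(...)` idea applied to raw lines.

```scala
// Hypothetical illustration, not taken from mg7.
object GroupedProblemSketch {

  // A made-up FASTA input with two records; the first sequence wraps over two lines.
  val fasta: List[String] = List(
    ">read1",
    "ACGTACGT",
    "TTGA",
    ">read2",
    "GGCC"
  )

  // Fixed-size grouping of raw lines, which is only safe when every record
  // occupies exactly two lines (header + one sequence line).
  val chunks: List[List[String]] =
    fasta.iterator.grouped(2).map(_.toList).toList

  // A chunk can be used as a standalone FASTA file only if it starts with a header.
  val brokenChunks: List[List[String]] =
    chunks.filterNot(_.head.startsWith(">"))
}
```

With this input, only the first chunk starts with a header, and it is missing the second line of its sequence; the remaining chunks begin in the middle of a record.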

laughedelic commented 8 years ago

Using this in split instead of .grouped(...) could help.

eparejatobes commented 8 years ago

I just released fastarious 0.4.0. You can use this: https://github.com/ohnosequences/fastarious/blob/v0.4.0/src/test/scala/FastaTests.scala#L92-L105 and take chunkSize from that iterator. I would do it myself, but I'm a bit lost with the configuration.
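
The general idea of taking chunkSize whole records from an iterator can be sketched as follows. This is a stand-in written with the standard library only; it does not reproduce the fastarious API or the linked test code, and `FastaChunkSketch`, `records`, and `chunks` are hypothetical names. It assumes every record starts with a `>` header line and that blank lines can be ignored.

```scala
// Hypothetical helper, not part of mg7 or fastarious.
object FastaChunkSketch {

  // Groups raw lines into FASTA records: a new record starts at every '>' line.
  // Assumes the first non-blank line is a header; any leading non-header line
  // would be treated as one.
  def records(lines: Iterator[String]): Iterator[List[String]] =
    new Iterator[List[String]] {
      private val rest = lines.filter(_.trim.nonEmpty).buffered

      def hasNext: Boolean = rest.hasNext

      def next(): List[String] = {
        val header = rest.next()
        val body = scala.collection.mutable.ListBuffer.empty[String]
        // Collect sequence lines until the next header or the end of input.
        while (rest.hasNext && !rest.head.startsWith(">")) body += rest.next()
        header :: body.toList
      }
    }

  // Chunks of at most chunkSize whole records, so no record is cut in the middle.
  def chunks(lines: Iterator[String], chunkSize: Int): Iterator[List[List[String]]] =
    records(lines).grouped(chunkSize).map(_.toList)
}
```

For example, `FastaChunkSketch.chunks(scala.io.Source.fromFile("reads.fasta").getLines(), 100)` would lazily yield chunks of up to 100 records each; the file name and the chunk size here are placeholders.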

laughedelic commented 8 years ago

This is the last feature I'm adding to M2; it's already way too big. Running tests in #25 and releasing.

laughedelic commented 8 years ago

LGTM. I've merged the fixes from #26 in. Now I'm merging this and will test it tomorrow.