ohnosequences / mg7

Configurable and scalable 16S metagenomics data analysis
https://goo.gl/y3rZFD
GNU Affero General Public License v3.0
Simplify the pipeline #32

Open laughedelic opened 8 years ago

laughedelic commented 8 years ago

Everybody knows that launching loquats is slow and that managing the pipeline manually is inconvenient, so I suggest simplifying the pipeline.

Specifically, I suggest merging these three steps into one loquat:

  1. merge blast results
  2. do assignment
  3. do counting

Again, this works as one task per sample.
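To make the idea concrete, here is a minimal Python sketch of one combined per-sample task. Everything in it is hypothetical: the function names, the `(read_id, taxon, score)` hit format, and the best-score rule are stand-ins for illustration, not mg7's actual data model or assignment logic.

```python
from collections import Counter
from itertools import chain


def merge_blast_results(chunks):
    """Concatenate the per-chunk hit lists of one sample into a single list."""
    return list(chain.from_iterable(chunks))


def assign(hits):
    """Placeholder assignment: keep the highest-scoring taxon for each read.

    This is an assumed stand-in rule, not the real mg7 assignment.
    """
    best = {}
    for read_id, taxon, score in hits:
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (taxon, score)
    return {read_id: taxon for read_id, (taxon, _) in best.items()}


def count(assignments):
    """Count how many reads were assigned to each taxon."""
    return Counter(assignments.values())


def process_sample(chunks):
    """One task per sample: merge, then assign, then count."""
    return count(assign(merge_blast_results(chunks)))
```

With this shape, the three former steps run back to back inside one task, so nothing has to be launched or coordinated between them.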

The same reasons apply to merging the flash and split steps. But this requires keeping split as a separate step for pipelines that don't need paired-end read merging.

marina-manrique commented 8 years ago

Fine with me to do merge, assign, and count in the same task.

> The same reasons apply to merging the flash and split steps. But this requires keeping split as a separate step for pipelines that don't need paired-end read merging.

I would also prefer to keep these as independent steps so we can work with single reads/scaffolds...

laughedelic commented 8 years ago

> I would also prefer to keep these as independent steps so we can work with single reads/scaffolds...

What I mean is having different pipelines for these two cases:

For paired-end reads:

  1. flash + split-on-chunks
  2. blast
  3. merge-chunks + assign + count

And for non-paired-end reads/scaffolds:

  1. split-on-chunks
  2. blast
  3. merge-chunks + assign + count

(The difference is only in the first loquat/step.)
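The two variants above can be sketched as a shared tail plus a configurable first step. This is only an illustration in Python: `COMMON_TAIL`, `build_pipeline`, and the `paired_end` flag are hypothetical names, and the step labels are taken from the lists above rather than from any real loquat configuration.

```python
# Steps shared by both pipeline variants (labels copied from the lists above).
COMMON_TAIL = ["blast", "merge-chunks + assign + count"]


def build_pipeline(paired_end):
    """Return the ordered step names for one pipeline variant.

    Only the first step depends on whether the input is paired-end reads.
    """
    first = "flash + split-on-chunks" if paired_end else "split-on-chunks"
    return [first] + COMMON_TAIL


paired = build_pipeline(paired_end=True)
single = build_pipeline(paired_end=False)
```

Both lists have three steps and are identical from the second step onward, which is the point of the proposal.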

@marina-manrique @eparejatobes I'd like to get feedback from you, because if you support this suggestion, I want to include it in the next release:

marina-manrique commented 8 years ago

Perfect for me

eparejatobes commented 8 years ago

OK, :+1: