bionode / bionode-watermill

💧Bionode-Watermill: A (Not Yet Streaming) Workflow Engine
https://bionode.gitbooks.io/bionode-watermill/content/
MIT License

Timeline #2

Closed thejmazz closed 7 years ago

thejmazz commented 8 years ago

A tentative plan for the way forward. I think once the core API is stable, time would be best spent reimplementing real world workflows that can be most improved with Streams. The primary concerns are

After that, work can begin on the DSL, Docker, admin panel app, nbind, etc.

If you think some plans have not been given as much time as they should be, or that some should be swapped or triaged in favor of others, don't hesitate to let me know. I'd like us all to agree on realistic plans for these next 8 weeks that are exciting and fully satisfy the original overarching goal.

Week 5

Summary

For this week, continue improving waterwheel and its examples with real-world workflows. We already have a basic genomic one; I'd like to try out an RNA-Seq pipeline, or whatever you guys suggest. Revised plan: stick to improving the genomic workflow with sound reentrancy.

The task orchestration core codebase is largely resolved: parallel/forking/join becomes easy when a task returns a regular, compliant stream (i.e. no more custom events). Forking is not done yet, but because a task is now a stream, it can be done with existing modules, e.g. multi-write-stream.
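To illustrate why forking gets easy once a task is an ordinary stream, here is a minimal sketch using only Node's core `stream` module. It is not waterwheel's implementation and it does not use multi-write-stream; it swaps in plain `pipe()` to several destinations. The names `createTaskStream`, `createCollector`, `fork` and `forkDemo` are hypothetical, made up for this example.

```javascript
const { Readable, Writable } = require('stream')

// Hypothetical stand-in for a task's output: an ordinary readable stream.
function createTaskStream (lines) {
  return Readable.from(lines)
}

// Hypothetical sink that records every chunk it receives.
function createCollector (store) {
  return new Writable({
    objectMode: true,
    write (chunk, encoding, callback) {
      store.push(String(chunk))
      callback()
    }
  })
}

// Fork: pipe one source into several destinations.
// Every destination receives every chunk from the source.
function fork (source, destinations) {
  const finished = destinations.map((dest) => new Promise((resolve, reject) => {
    dest.on('finish', resolve)
    dest.on('error', reject)
  }))
  destinations.forEach((dest) => source.pipe(dest))
  return Promise.all(finished)
}

// Run one source through two collectors and return what each one saw.
function forkDemo () {
  const a = []
  const b = []
  const source = createTaskStream(['read_1', 'read_2', 'read_3'])
  return fork(source, [createCollector(a), createCollector(b)])
    .then(() => ({ a, b }))
}
```

Calling `forkDemo()` resolves once both collectors have finished, and each holds all three chunks. No custom events are needed: the standard `finish` and `error` events are enough to know when each branch of the fork is done.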

Week 6

Pushed down:

Week summary:

Week Summary:

Week 8

Week summary:

Week 9

Week 10

Week 11

Week 12

Extras/Pushed out

bmpvieira commented 8 years ago

Good, but I think that:

need to be much higher (like week 6) since:

Otherwise, it might be very easy to get distracted implementing features that sound nice but might not have a clear and immediate benefit to people. Also, the web/electron GUI is an ambitious project in itself, and so we shouldn't dedicate time to it until we have a solid and useful CLI solution, which probably won't happen before the end of GSoC since we only have 8 weeks left.

I believe @yannickwurm shares a similar view on this.

thejmazz commented 8 years ago

Edited the timeline. If you guys can pick out a few papers with good methods that we can reproduce with waterwheel, that would be great. Having a concrete workflow to implement really helps set direction and discover edge/use cases. I'm thinking of something that takes a stream of SRA accessions (and can then distribute those emitted values over SGE/cluster). I'd like to play with RNAseq a bit too, maybe use Kallisto.
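A rough sketch of the "stream of accessions distributed over workers" idea, using only Node's core `stream` module. Everything here is an assumption for illustration: `createWorker` and `distribute` are hypothetical names, the workers only record what they are handed (no SGE submission is sketched), round-robin is just one possible distribution policy, and the accession values in the usage note are placeholders, not real SRA IDs.

```javascript
const { Readable, Writable } = require('stream')

// Hypothetical worker: it only records the accessions it is handed.
// A real worker might submit a cluster job instead; that is not shown.
function createWorker (name, log) {
  return new Writable({
    objectMode: true,
    write (accession, encoding, callback) {
      log.push(`${name}:${accession}`)
      callback()
    }
  })
}

// Round-robin distributor: each accession goes to exactly one worker.
function distribute (accessions, workers) {
  let next = 0
  const distributor = new Writable({
    objectMode: true,
    write (accession, encoding, callback) {
      const worker = workers[next]
      next = (next + 1) % workers.length
      // Take the next accession only after the worker has accepted this one.
      worker.write(accession, callback)
    },
    final (callback) {
      // No more accessions: close every worker.
      workers.forEach((worker) => worker.end())
      callback()
    }
  })
  return new Promise((resolve, reject) => {
    distributor.on('finish', resolve)
    distributor.on('error', reject)
    Readable.from(accessions).pipe(distributor)
  })
}
```

For example, `distribute(['acc1', 'acc2', 'acc3'], [createWorker('w0', log), createWorker('w1', log)])` hands `acc1` and `acc3` to `w0` and `acc2` to `w1`. Because the distributor waits for each worker's write callback, a slow worker slows down the accession stream rather than being flooded.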

bmpvieira commented 7 years ago

Some of this can be recycled for next GSoC, but I'm closing it since the 2016 one is over. Here's a summary of what was achieved and what's next: https://github.com/bionode/gsoc16/blob/master/README.md