gritzko / swarm

JavaScript replicated model (M of MVC) library
http://swarmdb.net/
MIT License
2.68k stars, 97 forks

extension towards parallel map / reduce that maps eventually #42

Open ghost opened 9 years ago

ghost commented 9 years ago

This is a nice reactive dataflow system.

I was wondering if this project intends to allow a single data node in the DAG to be parallelized into, say, 100 DAG nodes of the same type. Each one would then receive a slice of the data on the bus, and the results would converge.

It's a logical next step, I feel, to use the power of CRDTs: I think you no longer have to conform to map/reduce, because the results can join over time.
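To make "the results can join over time" concrete, here is a minimal sketch assuming a state-based grow-only counter (G-Counter). The helper names `increment`, `merge`, and `value` are hypothetical and not part of Swarm or Bacon.js. The point it illustrates is that the merge gives the same answer whatever order the partial states arrive in, and even if one arrives twice.

```javascript
// Hypothetical helpers for illustration only; not a Swarm or Bacon.js API.
// A G-Counter keeps one slot per replica and merges by per-slot maximum.

// Return a new counter state with `replicaId`'s slot increased.
function increment(counter, replicaId, amount = 1) {
  return { ...counter, [replicaId]: (counter[replicaId] || 0) + amount };
}

// Join two states: commutative, associative, and idempotent.
function merge(a, b) {
  const out = { ...a };
  for (const [id, n] of Object.entries(b)) {
    out[id] = Math.max(out[id] || 0, n);
  }
  return out;
}

// The counter's value is the sum of all replica slots.
function value(counter) {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}

const a = increment(increment({}, 'a'), 'a'); // replica "a" counted twice
const b = increment({}, 'b', 5);              // replica "b" counted five
const ab = merge(a, b);                       // states joined in one order
const ba = merge(b, a);                       // and in the other order
const again = merge(ab, a);                   // re-delivering "a" changes nothing
```

Because the merge tolerates reordering and duplicates, replicas can exchange state whenever they happen to connect rather than at a fixed reduce step.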

Bacon.js takes a similar approach to Swarm (not the same), but has formalized map/reduce without CRDTs. https://github.com/baconjs/bacon.js

The two joined would be potent!

Can you have a look and see if you see what I see? I would like to converge the two and add a formal, serialisable DSL, perhaps with a web IDE on top.

gritzko commented 9 years ago

Well, seems too theoretical for me. Can you illustrate that with some real-life workload scenario?

ghost commented 9 years ago

Real-life workload: when you need to increase the throughput of a system, you run many instances of it in parallel. This is what Spark and Storm do, in slightly different fundamental ways. So the real-world workload is a system requirement for faster results.

A reactive framework operates as a DAG, where a single node acts as a computational node. To meet the real-world requirement, we need, for example, 100 of them. For each of the 100 to do its share of the computation, the data stream is sliced into 100 separate chunks. This is what Cassandra and Spark do using DStreams. Does this help?
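The slicing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not how Swarm, Spark, or Storm implement it: `hashKey`, `partition`, `countByKey`, and `mergeCounts` are hypothetical names, the "stream" is a plain array, and the workers run sequentially in one process.

```javascript
// Hypothetical sketch; none of these functions belong to a real library.

// Deterministic string hash (djb2 variant), used only to pick a slice.
function hashKey(key) {
  let h = 5381;
  for (let i = 0; i < key.length; i++) {
    h = ((h * 33) ^ key.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Split one stream of events into `n` slices, one per worker node.
// Events with the same key always land in the same slice.
function partition(events, n) {
  const slices = Array.from({ length: n }, () => []);
  for (const e of events) {
    slices[hashKey(e.key) % n].push(e);
  }
  return slices;
}

// Each worker reduces only its own slice to a partial result.
function countByKey(slice) {
  const counts = {};
  for (const e of slice) counts[e.key] = (counts[e.key] || 0) + 1;
  return counts;
}

// Partial results converge by per-key addition. Addition is commutative
// and associative but NOT idempotent, so each partial must be merged once.
function mergeCounts(a, b) {
  const out = { ...a };
  for (const [k, n] of Object.entries(b)) out[k] = (out[k] || 0) + n;
  return out;
}

const events = ['x', 'y', 'x', 'z', 'y', 'x'].map((key) => ({ key }));
const partials = partition(events, 4).map(countByKey);
const total = partials.reduce(mergeCounts, {});
const totalReversed = [...partials].reverse().reduce(mergeCounts, {});
```

The partials can be merged in any order and give the same totals. Making the merge safe against duplicate delivery as well would need a CRDT-style structure, such as a per-worker slot merged by maximum.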

gritzko commented 8 years ago

Regarding multi-server/load-distribution scenarios, the only thing currently planned is consistent hashing. Using CRDTs for large scale computing is definitely out of the project's scope at this stage. Eventually, Swarm may expand in that direction, but that is too different from what we are doing now. There are some immediate practical challenges, like distributed counters. Those seem to be resolvable by the current means.
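For readers unfamiliar with consistent hashing, here is a minimal sketch of the general technique. It is not Swarm's implementation: `HashRing`, `hash32`, the server names, and the choice of 64 virtual points per server are all assumptions made for illustration.

```javascript
// Hypothetical consistent-hash ring; not taken from the Swarm codebase.

// 32-bit FNV-1a hash of a string, returned as an unsigned integer.
function hash32(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

class HashRing {
  // Each server is placed on the ring at `replicas` pseudo-random points.
  constructor(servers, replicas = 64) {
    this.points = [];
    for (const server of servers) {
      for (let i = 0; i < replicas; i++) {
        this.points.push({ pos: hash32(`${server}#${i}`), server });
      }
    }
    this.points.sort((a, b) => a.pos - b.pos);
  }

  // A key belongs to the first point at or after its hash, wrapping around.
  lookup(key) {
    const h = hash32(key);
    const hit = this.points.find((p) => p.pos >= h);
    return (hit || this.points[0]).server;
  }
}

const before = new HashRing(['s1', 's2', 's3']);
const after = new HashRing(['s1', 's2', 's3', 's4']);
const keys = Array.from({ length: 1000 }, (_, i) => `object-${i}`);
// Keys whose owner changed when the fourth server joined.
const moved = keys.filter((k) => before.lookup(k) !== after.lookup(k));
```

In this sketch, adding `s4` only reassigns keys to `s4`; no key moves between the existing servers, which is the property that makes the scheme attractive for load distribution.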

gritzko commented 8 years ago

P.S. The next step in that direction is adding server-side event filters/listeners.

ghost commented 8 years ago

Looking forward to playing with it and seeing where I can push it. This will integrate nicely with Lovefield, btw.