discoproject / disco

a Map/Reduce framework for distributed computing
http://discoproject.org
BSD 3-Clause "New" or "Revised" License

Running jobs on already running nodes #329

Open pavlobaron opened 12 years ago

pavlobaron commented 12 years ago

hi,

while working on komuro (https://github.com/pavlobaron/komuro), I had to enable Disco to run on nodes that are already running since you don't want to restart your Riak nodes just for analytics or use a separate cluster with upfront replication of live data. The fork is here: https://github.com/pavlobaron/disco

I haven't yet tested whether I have broken any of the standard behaviour. If you guys are interested in this extension becoming part of the standard, I would love to help.

Pavlo

pavlobaron commented 12 years ago

I have also extended Disco to accept erl:// inputs: calling an Erlang function and taking its results as raw://. There are some considerations and limitations described in the example and the scheme script.
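
As a rough illustration of the idea only (not the fork's actual code): a raw:// input carries its payload inline in the URL, so the results of a function call can each be wrapped as a raw:// input. This is a minimal sketch in which the helper names are hypothetical and a plain Python function stands in for the Erlang call.

```python
RAW_PREFIX = "raw://"

def to_raw_inputs(results):
    """Wrap each result as a raw:// input URL (the payload is carried inline)."""
    return [RAW_PREFIX + str(r) for r in results]

def raw_payload(url):
    """Return the inline payload of a raw:// URL."""
    if not url.startswith(RAW_PREFIX):
        raise ValueError("not a raw:// URL: %r" % url)
    return url[len(RAW_PREFIX):]

def fetch_values():
    # Hypothetical stand-in for the Erlang function an erl:// input would name.
    return ["a", "b", 3]

# Each returned value becomes one job input.
inputs = to_raw_inputs(fetch_values())
```

The limitations mentioned above are not modelled here; the sketch only shows the results-to-raw:// mapping.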

pmundkur commented 12 years ago

Very interesting and useful stuff, thanks! We'll try and integrate soon.

pavlobaron commented 11 years ago

I have also implemented a RabbitMQ scheme in my fork, as well as pulling Riak local vnode values through it. But my fork is out of sync since I've touched a lot of spots in the code. Are you guys interested in merging this into the mainline? Would love to help.

pmundkur commented 11 years ago

Yes, we are definitely interested! We are currently late in releasing 0.4.4, so it might be best to integrate for 0.5 once the release is out.

pmundkur commented 11 years ago

The road is clear for integration! Would it be possible to submit a separate pull request for each feature, based off of the latest Disco master or the 0.4.4 tag?

pavlobaron commented 11 years ago

yes, sure. Let's just do this step by step, starting in January. I would start with the part that allows running jobs on already running workers. Then, the erl:// and queue:// parts. Is this ok?

pmundkur commented 11 years ago

On 08:44 Sat 15 Dec, Pavlo Baron wrote:

> yes, sure. Let's just do this step by step, starting in January. I would start with the part that allows running jobs on already running workers. Then, the erl:// and queue:// parts. Is this ok?

Sounds great, thanks.

pavlobaron commented 11 years ago

hi,

I wanted to start working on that, but I would need someone to explain the local_cluster mode, so that I don't interfere with it or can just extend it. Is there any way to chat, perhaps on IRC?

cheers pb

pmundkur commented 11 years ago

Sure, IRC is fine, though the mailing list might be best, so that the discussion is archived.

pooya commented 10 years ago

@pavlobaron If you still have this code around, I will be more than happy to help. I tried to find it in your disco fork but could not. Thanks!

pavlobaron commented 10 years ago

quite an early stage of the work is in my disco playground: https://github.com/pavlobaron/disco_playground . My local copy is a mess where I experimented with working around replica data. I didn't go on with it due to other priorities, sorry.

pooya commented 10 years ago

@pavlobaron No worries. I'll check that out and try to see if it can be resurrected. Thank you.