cubejs / cluster2

A node.js (>= 0.8.x) compatible multi-process management module

replace specific domain socket with axon req/res pattern #40

Open inexplicable opened 10 years ago

inexplicable commented 10 years ago

https://npmjs.org/package/axon

this is to replace the usage of domain sockets with axon, which is tcp based. in theory a tcp connection will be slower than a domain socket, but on localhost the margin is limited, and domain sockets are not available on windows anyway. overall, we could switch to the axon library, where req/res and pub/sub are well supported. this should begin as an evaluation: measure the performance, and see how much less code we would need to write. overall this should reduce the maintenance effort.

once evaluated, we'll need to update cache-mgr, cache-usr, and cache-common to use axon. api shouldn't be affected at all.

inexplicable commented 10 years ago

    var axon = require('axon')
      , sock = axon.socket('pub');

    sock.bind(3000);
    console.log('pub server started');

    setInterval(function(){
      sock.send('hello');
    }, 500);

SubSocket simply receives any messages from a PubSocket:

    var axon = require('axon')
      , sock = axon.socket('sub');

    sock.connect(3000);

    sock.on('message', function(msg){
      console.log(msg.toString());
    });

inexplicable commented 10 years ago

    var axon = require('axon')
      , sock = axon.socket('req');

    sock.bind(3000);

    sock.send(img, function(res){

    });

RepSockets receive a reply callback that is used to respond to the request, you may have several of these nodes.

    var axon = require('axon')
      , sock = axon.socket('rep');

    sock.connect(3000);

    sock.on('message', function(img, reply){
      // resize the image
      reply(img);
    });

inexplicable commented 10 years ago

this is for cluster3 branch only

inexplicable commented 10 years ago

any updates?

zhuchenwang commented 10 years ago

Sorry, I have been working on a crypto module these days for the release on Nov. 4th. I will start working on cluster soon.

zhuchenwang commented 10 years ago

To maintain the original logic, we need two pairs of sockets between the cache manager and the cache users. One pair is req/rep: the rep socket is on the cache manager side and passively receives requests from the req socket on the cache user side, then replies to that cache user. The other pair is pub/sub: the pub socket is on the cache manager side so that it can actively notify the cache users.

Since this is an experiment, the code I've modified is quite ugly and contains a lot of hacks. I only made sure it can pass the performance test.

With 10 workers and 200 cache operations, the domain socket implementation takes about 370 - 415 ms to finish. The ratio of reads to writes does not affect the time much. The TCP socket implementation takes 400 - 500 ms to finish. The execution time varies from run to run, but the TCP socket implementation does take more time than the domain socket implementation.

inexplicable commented 10 years ago

that's fair. the gap is expected, and probably acceptable, considering that the complexity we added to our cluster2 impl could be reduced by using axon, and that windows would no longer be excluded from the supported runtimes.

let's create a branch, cut from the cluster3 branch, to implement axon based caching (manager/user). just to double check: must the pub/sub & req/res sockets be different?

note, the current design has considered manager failover. in that case a different domain socket is selected and announced (via file modification/watch); each user sees the file change and reconnects to the new cache manager. this must be preserved. thanks, let's roll.

zhuchenwang commented 10 years ago

The socket pairs in axon are a kind of one-way communication, i.e. only one side can actively send messages to the other. In req/rep, only the req socket has the 'send' method; the rep socket can only send back a reply, by calling the reply callback passed to its 'message' handler. Therefore, cache-usr will use a req socket to actively send requests, and cache-mgr will use a rep socket to receive the messages and reply to cache-usr. However, cache-mgr cannot send messages to cache-usr through the rep socket when it is not replying to anybody, e.g. for a notification. In order to let cache-mgr push notifications to all cache-usr instances, each cache-usr should have a sub socket and cache-mgr should have a pub socket.
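A rough sketch of that split on the cache-mgr side, to make the two paths concrete. It assumes the rep socket emits 'message' with (request, reply) and the pub socket exposes send(), as in the axon examples earlier in this thread. The helper name attachCacheManager, the { op, key, value } request shape and the 'changed' notification are made up for illustration, and plain stand-ins are used instead of real axon sockets so the snippet runs without the package.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper (not part of cluster2 or axon): wires a rep-style
// socket and a pub-style socket to a simple in-memory store.
function attachCacheManager(rep, pub, store) {
  rep.on('message', function (req, reply) {
    if (req.op === 'get') {
      // Passive path: answer the cache user that asked.
      reply({ ok: true, value: store[req.key] });
    } else if (req.op === 'set') {
      store[req.key] = req.value;
      reply({ ok: true });
      // Active path: the rep socket cannot push, so notify via pub.
      pub.send({ event: 'changed', key: req.key });
    } else {
      reply({ ok: false, error: 'unknown op' });
    }
  });
}

// Stand-ins for axon.socket('rep') and axon.socket('pub').
var rep = new EventEmitter();
var notifications = [];
var pub = { send: function (msg) { notifications.push(msg); } };

attachCacheManager(rep, pub, Object.create(null));

// Simulate two requests arriving from a cache user.
var replies = [];
function collect(res) { replies.push(res); }
rep.emit('message', { op: 'set', key: 'a', value: 1 }, collect);
rep.emit('message', { op: 'get', key: 'a' }, collect);
```

With real sockets, the same function would presumably be handed axon.socket('rep') and axon.socket('pub') after binding each to its own port.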

inexplicable commented 10 years ago

understood. please start cutting the branch and switch to axon. you might need to update the file based 'port/path' persistence/notification, as you will need 2 ports for req/res + pub/sub now.
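One possible shape for the file based announcement with two ports, using only node's fs module. The JSON layout { req, pub }, the function names and the temp file location are assumptions for illustration, not the existing cluster2 format.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical file format: {"req": <port>, "pub": <port>}.
// Write to a temp file and rename it over the target, so that readers
// should not observe a half-written file (rename is atomic on POSIX).
function announceEndpoints(file, endpoints) {
  var tmp = file + '.' + process.pid + '.tmp';
  fs.writeFileSync(tmp, JSON.stringify(endpoints));
  fs.renameSync(tmp, file);
}

function readEndpoints(file) {
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Cache user side: call onChange(endpoints) whenever the file is modified.
// Returns a function that stops watching.
function watchEndpoints(file, onChange) {
  function listener(curr, prev) {
    if (curr.mtime.getTime() !== prev.mtime.getTime()) {
      onChange(readEndpoints(file));
    }
  }
  fs.watchFile(file, { interval: 100 }, listener);
  return function stop() { fs.unwatchFile(file, listener); };
}

// Usage: the manager announces its ports, then announces new ones on failover.
var file = path.join(os.tmpdir(), 'cache-mgr-endpoints-' + process.pid + '.json');
announceEndpoints(file, { req: 3000, pub: 3001 });
var first = readEndpoints(file);
announceEndpoints(file, { req: 3002, pub: 3003 }); // simulated failover
var second = readEndpoints(file);
fs.unlinkSync(file);
```

A cache user would call watchEndpoints(file, reconnect) and reconnect both its req and sub sockets when the callback fires; the polling interval here is arbitrary.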

the ports should work like a resource pool: a port gets collected after all cache users stop listening to it, and is then enrolled in the queue again.
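A minimal sketch of such a pool, assuming a port is only handed back once the manager has retired it and the last cache user has dropped off. PortPool and its method names are hypothetical, and the bookkeeping is in-memory only.

```javascript
// Hypothetical port pool: a FIFO queue of free ports plus per-port
// bookkeeping for ports that are currently handed out.
function PortPool(ports) {
  this.queue = ports.slice();      // free ports, oldest first
  this.inUse = Object.create(null); // port -> { users, retired }
}

// Take the next free port for a new manager socket.
PortPool.prototype.acquire = function () {
  if (this.queue.length === 0) throw new Error('port pool exhausted');
  var port = this.queue.shift();
  this.inUse[port] = { users: 0, retired: false };
  return port;
};

PortPool.prototype.addUser = function (port) {
  this.inUse[port].users += 1;
};

PortPool.prototype.removeUser = function (port) {
  this.inUse[port].users -= 1;
  this._collect(port);
};

// The manager stopped using this port (e.g. after failover).
PortPool.prototype.retire = function (port) {
  this.inUse[port].retired = true;
  this._collect(port);
};

// Re-enqueue the port once it is retired and nobody listens to it.
PortPool.prototype._collect = function (port) {
  var entry = this.inUse[port];
  if (entry.retired && entry.users === 0) {
    delete this.inUse[port];
    this.queue.push(port);
  }
};

// Usage: two users on one port, manager fails over, users leave one by one.
var pool = new PortPool([3000, 3001, 3002]);
var reqPort = pool.acquire();
pool.addUser(reqPort);
pool.addUser(reqPort);
pool.retire(reqPort);
pool.removeUser(reqPort);
var queueWhileBusy = pool.queue.slice(); // one user is still connected
pool.removeUser(reqPort);
var queueAfter = pool.queue.slice();     // port is back at the end of the queue
```

Re-enqueueing at the tail means a just-released port is reused last, which leaves slow users some time to notice the change before the port is bound again.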