TomFrost / Bristol

Insanely configurable logging for Node.js
MIT License
113 stars, 19 forks

Cluster Awareness #3

Closed dmuth closed 7 years ago

dmuth commented 10 years ago

Let's say you have multiple processes logging to the same file:

var cluster = require("cluster");
var log = require("bristol");

var num_cpus = 2;

if (cluster.isMaster) {
    for (var i = 0; i < num_cpus; i++) {
        cluster.fork();
    }

} else {
    log.addTarget("file", { file: "log.txt" });
    log.addTarget("console")
        .withFormatter("human")
        ;
    log.info(process.pid, "Lorem Ipsum");

}

Then run the script, and while the script is running, fire up lsof:

$ lsof ./log.txt 
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
node    49511 doug   13w   REG    1,2      788 66444792 ./log.txt
node    49512 doug   13w   REG    1,2      788 66444792 ./log.txt

Two processes, two different file descriptors, same file. Under UNIX, there is no guarantee that writes to a file are atomic, so one process can write part of a line before another process writes its own. The output gets interleaved, creating a "staircase" effect.

Opening the file in the master process and then spawning the child processes won't help. (I tried.)

The best solution I found so far is to have a single process write files. This can be done with process.send() as follows:

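//
// In the master process: listen on each forked worker, since
// process.on('message') only fires inside a child process
//
var worker = cluster.fork();
worker.on('message', function(m) {
   // Write to file
});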

//
// In the child process, JSON data structure is sent to the master process
//
process.send({ foo: 'bar' });

There are probably other ways to accomplish the same. This particular technique worked for me in my projects.
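To make the single-writer idea concrete, here is a minimal sketch under stated assumptions. The helper names (formatLine, attachWorker, logFromWorker, main) are hypothetical and are not part of Bristol, and the one-JSON-object-per-line format is only an illustration. The master opens log.txt once and is the only process that writes to it.

```javascript
// Sketch only: formatLine, attachWorker, logFromWorker and main are
// hypothetical helper names, not part of Bristol's API.
var cluster = require("cluster");
var fs = require("fs");

// Turn a message received from a worker into one newline-terminated line.
function formatLine(pid, msg) {
    return JSON.stringify({ pid: pid, level: msg.level, message: msg.message }) + "\n";
}

// Master side: forward every message from one worker into the shared stream.
function attachWorker(worker, stream) {
    worker.on("message", function (msg) {
        stream.write(formatLine(worker.process.pid, msg));
    });
}

// Worker side: hand the record to the master instead of touching the file.
function logFromWorker(level, message) {
    process.send({ level: level, message: message });
}

// Not invoked automatically; call main() from a script to try it.
function main() {
    if (cluster.isMaster) {
        // Only the master opens the file, so there is a single writer.
        var stream = fs.createWriteStream("log.txt", { flags: "a" });
        for (var i = 0; i < 2; i++) {
            attachWorker(cluster.fork(), stream);
        }
    } else {
        logFromWorker("info", "Lorem Ipsum");
    }
}
```

Because every line goes through one stream in one process, whole lines are written in the order the master receives them. The trade-off is that each record is serialized over the IPC channel first.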

TomFrost commented 10 years ago

Good workaround -- unfortunately, this is the case with pretty much all logging libs that keep an open WriteStream (and frankly, a lib that used one-off writes instead would have so much of a performance impact, and put such a drain on the thread pool, that it wouldn't be worth using).
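For readers unfamiliar with the distinction, the two write styles can be sketched roughly as follows. The function names logOneOff and makeStreamLogger are hypothetical, and this is not how Bristol's file target is implemented. It only contrasts a one-off append, which opens and closes the file on every call, with a long-lived stream that reuses one descriptor.

```javascript
// Sketch only: logOneOff and makeStreamLogger are hypothetical names.
var fs = require("fs");

// One-off write: each call opens the file, appends the line, and closes it.
// Every step is a separate asynchronous filesystem operation.
function logOneOff(file, line, done) {
    fs.appendFile(file, line + "\n", done);
}

// Long-lived stream: the file is opened once and the descriptor is reused
// for every subsequent write.
function makeStreamLogger(file) {
    var stream = fs.createWriteStream(file, { flags: "a" });
    return {
        log: function (line) { stream.write(line + "\n"); },
        close: function (done) { stream.end(done); }
    };
}
```

Neither style makes concurrent writes from several processes safe on its own. The stream version just avoids the repeated open and close on every log call.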

However, I'm keeping this incident open because I have a decent idea to get child processes' Bristol instances to autoconfigure according to the parent process, and automate the passing of messages. I'll comment again when that's in.

TomFrost commented 7 years ago

I'm going to close this for now. While there's a good solution to this, it would appear to be not worth the effort at this point. There's a strong consensus that for most use cases, if you're running Node on a machine with multiple CPUs, you should either:

With the above, you multiply your durability for literally free, and gain the advantage of a less complex codebase as a result.

I'll reopen if this becomes a popular request!