Unitech / pm2

Node.js Production Process Manager with a built-in Load Balancer.
https://pm2.keymetrics.io/docs/usage/quick-start/
Other
41.4k stars 2.61k forks source link

Node 0.12 - PM2@latest and cluster_mode passing to all processes #1000

Closed askhogan closed 9 years ago

askhogan commented 9 years ago

I setup a simple express script. I then set it up similar to the existing examples and existing test cases

I setup a simple route that does nothing but log. I performed an http GET to this route and I received log writing on all 8 processes. Per the documentation it seemed that PM2 handles routing to only one cluster process at a time. I am not seeing that happen. All processes get the same GET request and perform the same call 8x times.

How do you setup a config to only process 1 time per API request?

API-7 (out): 2015-02-08 10:10 +00:00: [2015-02-08 10:10:28.585] [WARN] api - healthy API-6 (out): 2015-02-08 10:10 +00:00: [2015-02-08 10:10:28.585] [WARN] api - healthy API-5 (out): 2015-02-08 10:10 +00:00: [2015-02-08 10:10:28.585] [WARN] api - healthy API-4 (out): 2015-02-08 10:10 +00:00: [2015-02-08 10:10:28.585] [WARN] api - healthy API-3 (out): 2015-02-08 10:10 +00:00: [2015-02-08 10:10:28.585] [WARN] api - healthy API-2 (out): 2015-02-08 10:10 +00:00: [2015-02-08 10:10:28.585] [WARN] api - healthy API-1 (out): 2015-02-08 10:10 +00:00: [2015-02-08 10:10:28.585] [WARN] api - healthy API-0 (out): 2015-02-08 10:10 +00:00: [2015-02-08 10:10:28.585] [WARN] api - healthy

thomaschaaf commented 9 years ago

I am seeing this too, which sort of defeats the purpose.

askhogan commented 9 years ago

Totally. Seems like a super solid program. The clustering makes a lot of sense to me, because Elastic Load Balancer can only connect to one port. So it seems intuitive to have ELB -> PM2 Cluster Port -> PM2 Workers to gain 100% uptime.

What I cannot wrap my head around is how PM2 takes a port (e.g. 3000) which is bound to a system resource and defined in each file that then gets clustered to 8 separate processes (which would inherently cause a port conflict).

Is it somehow modifying express configuration on the fly to change the port. What I was expecting was there would be one main PM2 port that I could send my ELB connection to (https://api.callinize.com:9000). So my ELB would only point to that one isolated port. Instead what it seems is there is some magic behind the scenes to do this for you. It just makes me suspicious, because NAT, port or network card virtualization seems like an extremely complicated task.

From the docs it just says

With the cluster mode, PM2 enables load balancing between each worker. Each HTTP/TCP/UDP request will be forwarded to one specific process at a time.

But how does it do this? Where in the code? How does it connect to the port that is already inside of your typical API express setup. From the docs it seems API is the example case, so I assume inbound web proxying is the norm for this setup.

soyuka commented 9 years ago

Hi @askhogan, pm2 is using the cluster module (code), there is no port modification anywhere. Under the hood, it's simply forking an app and redirecting your request to the first worker available. I remind you that it's the clustering module that does this, not pm2 itself.

Here in pm2, God is the cluster master, the forking is done here. You might also want to take a look at ProcessContainer, this is where your process is living.

bwin commented 9 years ago

@askhogan it does work exactly as you expected. You can verify that with the following:

// app.js
var http = require('http');
http.createServer(function(req, res) {
    console.log("req", req.method, req.url);
    res.writeHead(200);
    res.end('hey from pm2 worker #' + process.env.pm_id + '\n' + new Date());
}).listen(8000);

Edit: modified the response in the example code.

Then start it with pm2 start --name testapp -i 4 app.js. Watch the logs with pm2 logs and make a request to localhost:8000. You will see that only one worker responds (everything else wouldn't make sense). As soyuka pointed out this is done with the cluster module from node-core. Only node > 0.10 is supported though. Have you checked your node version? If you're using a supported node version, (ok, it was already in the title) you should investigate why your app won't work with the cluster module, since an express app out of the box should work AFAIK.

Edit: The link to the cluster module is broken in the previous comment.

Edit: I was assuming an error in your code. I'm changing the wording from "you can verify..." to "could you see, if ... works as expected". It should definitely work as you expected it to work. That's for sure.

bwin commented 9 years ago

I can confirm that it does (at least for me) work with express (latest) out of the box.

var express = require('express');
var app = express();

app.get('/', function (req, res) {
  console.log("req", req.method, req.url);
  res.send('hey from pm2 worker #' + process.env.pm_id + '\n' + new Date());
});

var server = app.listen(3000, function () {
  var host = server.address().address;
  var port = server.address().port;
  console.log('Example app listening at http://%s:%s', host, port);
});

Edit: I'm also on node 0.12.0. pm2 0.12.5

askhogan commented 9 years ago

Thanks for posting express example. That was going to be my response to the first comment. I was using express 4 and node .12. I must have something in my setup that is causing this. I am going to check in the morning to see what is going on with my app or logger, log4js.

thomaschaaf commented 9 years ago

I found the problem! It is related to the --log flag. Setting it results pm2 logs outputting everything for every instance.

test.js:

var express = require('express');
var app = express();

app.get('/', function (req, res) {
  console.log("req", req.method, req.url);
  res.send('hey from pm2 worker #' + process.env.pm_id + '\n' + new Date());
});

var server = app.listen(3000, function () {
  var host = server.address().address;
  var port = server.address().port;
  console.log('Example app listening at http://%s:%s', host, port);
});
pm2 start -i 4 test.js
pm2 logs
HTTP response -> hey from pm2 worker #33 Sun Feb 15 2015 10:38:43 GMT+0000 (UTC)
Log:
test-33 (out): req GET /
pm2 start -i 4 -l test.out test.js
pm2 logs
HTTP response -> hey from pm2 worker #29 Sun Feb 15 2015 10:37:44 GMT+0000 (UTC)
Log: 
test-31 (entire): req GET /
test-29 (entire): req GET /
test-32 (entire): req GET /
test-30 (entire): req GET /
Tjatse commented 9 years ago

@thomaschaaf --log is designed to debug error context. of course the logs exist in every place (duplicated in separated logs). @askhogan 's example seems irrelevant to the entire log, it's out.

bwin commented 9 years ago

@thomaschaaf -l merges error and out logs into a single file, but you still have separate logfiles for each instance. (Check with pm2 info APPNAME.)

pm2 logs APPNAME (or just pm2 logs) displays the logs of all instances. To view the log of a single instance you would do pm2 logs ID.

From pm2 --help:

logs [options] [id|name] stream logs file. Default stream all logs

Edit: Somebody correct me if I'm lying.

askhogan commented 9 years ago

@bwin you indicate you are using pm2 0.12.5

But according to releases 0.12.4 is the latest?

https://github.com/Unitech/PM2/releases

I am assuming you are running the latest dev version in master. Just wanted to clarify for troubleshooting that the version you are running may not be what others are running, bc they are getting a stable release from npm.

bwin commented 9 years ago

I've got the latest stable from npm. That's 0.12.5. See https://www.npmjs.com/package/pm2

Edit: Sometimes somebody forgets to tag a git version. I think it's safe to always assume a npm version unless otherwise indicated.

askhogan commented 9 years ago

Ok weird. They must be out of sync on GH releases. I will install that version.

Thank you for your example yesterday showing the worker ID. This just helped me isolate. Here is my log output. It seems I am getting log output from the same worker. digging deeper. This may just be a logging issue or conflict with my logging app log4js

testapp-7 (out): 2015-02-15 20:31 +00:00: [2015-02-15 20:31:56.932] [WARN] app - hey from pm2 worker #3 testapp-7 (out): Sun Feb 15 2015 20:31:56 GMT+0000 (GMT) testapp-2 (out): 2015-02-15 20:31 +00:00: [2015-02-15 20:31:56.932] [WARN] app - hey from pm2 worker #3 testapp-2 (out): Sun Feb 15 2015 20:31:56 GMT+0000 (GMT) testapp-6 (out): 2015-02-15 20:31 +00:00: [2015-02-15 20:31:56.932] [WARN] app - hey from pm2 worker #3 testapp-6 (out): Sun Feb 15 2015 20:31:56 GMT+0000 (GMT) testapp-1 (out): 2015-02-15 20:31 +00:00: [2015-02-15 20:31:56.932] [WARN] app - hey from pm2 worker #3 testapp-1 (out): Sun Feb 15 2015 20:31:56 GMT+0000 (GMT) testapp-0 (out): 2015-02-15 20:31 +00:00: [2015-02-15 20:31:56.932] [WARN] app - hey from pm2 worker #3 testapp-0 (out): Sun Feb 15 2015 20:31:56 GMT+0000 (GMT) testapp-5 (out): 2015-02-15 20:31 +00:00: [2015-02-15 20:31:56.932] [WARN] app - hey from pm2 worker #3 testapp-5 (out): Sun Feb 15 2015 20:31:56 GMT+0000 (GMT) testapp-4 (out): 2015-02-15 20:31 +00:00: [2015-02-15 20:31:56.932] [WARN] app - hey from pm2 worker #3 testapp-4 (out): Sun Feb 15 2015 20:31:56 GMT+0000 (GMT) testapp-3 (out): 2015-02-15 20:31 +00:00: [2015-02-15 20:31:56.932] [WARN] app - hey from pm2 worker #3 testapp-3 (out): Sun Feb 15 2015 20:31:56 GMT+0000 (GMT)

thomaschaaf commented 9 years ago

@askhogan have you tried not using any of the logging flags? Or if you are using the json then removing them? That has helped for me.

    -l --log [path]                      specify entire log file (error and out are both included)
    -o --output <path>                   specify out log file
    -e --error <path>                    specify error log file
askhogan commented 9 years ago

@thomaschaaf Thanks for pointing that out in your earlier comments. That lead me to figuring out the issue for my config. I change merge_logs from true to false and now I am getting just one log entry. This is the new config I am using

{
  "apps" : [{
    "exec_interpreter"   : "node",
    "exec_mode"          : "cluster_mode",
    "instances"          : 0,
    "log_date_format"    : "YYYY-MM-DD HH:mm Z",
    "merge_logs"         : false,
    "name"               : "testapp",
    "script"             : "dashboard.js",
    "cwd"                : "_dashboard/server",
    "node_args"          : "--harmony",
    "ignoreWatch"        : ["[\\/\\\\]\\./", "log"],
    "watch"              : false
  }]
}
askhogan commented 9 years ago

I was able to recreate this using @bwin 's example

process.json

{
  "apps" : [{
    "exec_interpreter"   : "node",
    "exec_mode"          : "cluster_mode",
    "instances"          : 0,
    "log_date_format"    : "YYYY-MM-DD HH:mm Z",
    "max_memory_restart" : "160",
    "merge_logs"         : true,
    "name"               : "testapp",
    "script"             : "server.js",
    "cwd"                : "dashboard",
    "node_args"          : "--harmony",
    "ignoreWatch"        : ["[\\/\\\\]\\./", "log"],
    "watch"              : true
  }]
}

dashboard/server.js

var express = require('express');
var app = express();

app.get('/', function (req, res) {
    console.log("req", req.method, req.url);
    res.send('hey from pm2 worker #' + process.env.pm_id + '\n' + new Date());
});

var server = app.listen(3045, function () {
    var host = server.address().address;
    var port = server.address().port;
    console.log('Example app listening at http://%s:%s', host, port);
});

node 0.12 pm2 0.12.5

Run the command pm2 start process.json Run the command pm2 logs

Then run a GET command to http://localhost:3045/

pm2 logs outputs a console.log for each worker process

testapp-3 (out): 2015-02-15 14:55 -06:00: req GET /?
testapp-4 (out): 2015-02-15 14:55 -06:00: req GET /?
testapp-2 (out): 2015-02-15 14:55 -06:00: req GET /?
testapp-1 (out): 2015-02-15 14:55 -06:00: req GET /?
testapp-0 (out): 2015-02-15 14:55 -06:00: req GET /?
testapp-7 (out): 2015-02-15 14:55 -06:00: req GET /?
testapp-5 (out): 2015-02-15 14:55 -06:00: req GET /?
testapp-6 (out): 2015-02-15 14:55 -06:00: req GET /?

However in the GET request response you get just one reply from one worker

hey from pm2 worker #6
Sun Feb 15 2015 14:55:01 GMT-0600 (CST)
jshkurti commented 9 years ago

What I cannot wrap my head around is how PM2 takes a port (e.g. 3000) which is bound to a system resource and defined in each file that then gets clustered to 8 separate processes (which would inherently cause a port conflict).

As said earlier in this post, PM2 uses Node.js' cluster module. There is no port conflict because cluster module internally spawns some workers. Only the master listens on the port you chose and then redirects the requests to the most avaible worker communicating via IPC.

I'm closing this issue since it's a log problem and not a cluster_module one :)

askhogan commented 9 years ago

@jshkurti :) There was a confirmed bug in here though. I will re-open under another banner

thomaschaaf commented 9 years ago

Not entierly sure this should be closed...

askhogan commented 9 years ago

I get his point. Its the title and beginning, he doesnt want to confuse ppl. I opened a new ticket here for the confirmed logging issue --> https://github.com/Unitech/PM2/issues/1021

jshkurti commented 9 years ago

Thanks :)

bwin commented 9 years ago

Then please give a lightweight example to reproduce your error.

Edit: Oops, too late...

askhogan commented 9 years ago

@bwin read above and check ticket. I used your example.

mogongzi commented 8 years ago

Hi @askhogan,

Do you use log4js module under pm2 cluster mode successfully? Could you please refer the following issue? https://github.com/Unitech/pm2/issues/383