moleculerjs / moleculer

:rocket: Progressive microservices framework for Node.js
https://moleculer.services/
MIT License

CPU utilization-based Strategy should use process usage instead of os usage? #259

Closed Wallacy closed 6 years ago

Wallacy commented 6 years ago

Hello,

Looking at moleculer/src/cpu-usage.js, it appears to me that the current CPU strategy is tied to the CPU usage of the entire machine.

Is this the best way to do it? If we run multiple services on the same OS (inside Docker it will probably be fine), all the services will show the same CPU usage, even if only one is actually processing data. So a microservice that is busy doing something has the same probability of receiving the call as one that is idle.

I tried changing the implementation to use process.cpuUsage([previousValue]) ( https://nodejs.org/api/process.html#process_process_cpuusage_previousvalue ), but I did not get good accuracy when trying to compute the correct CPU %.

Using "pidusage" (https://github.com/soyuka/pidusage) I get good results on Linux, but not on Windows. Note that "pidusage" reports the process's CPU usage per core, so 100% on a 4-core CPU is 25% of the system (and 100% of one core, of course). But this is fine because Node.js is a single-core application anyway, so the load balancer will choose correctly.
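To make the per-core arithmetic above concrete, here is a minimal sketch of the conversion from a per-core percentage to a share of the whole machine. The helper name `toSystemShare` is hypothetical and is not part of pidusage or Moleculer; it only assumes that 100 means one logical core fully busy.

```javascript
const os = require("os");

// Hypothetical helper: convert a per-core percentage (100 = one logical
// core fully busy) into a share of the whole machine.
function toSystemShare(perCorePercent, coreCount = os.cpus().length) {
  return perCorePercent / coreCount;
}

console.log(toSystemShare(100, 4));  // 25
console.log(toSystemShare(100, 16)); // 6.25
```

The second call shows why a fully busy single-threaded process can look nearly idle when measured machine-wide on a host with many logical cores.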

PM2 appears to get the correct number too.

Any thoughts?

icebob commented 6 years ago

Hello,

I did it this way deliberately. For example, if a DB server installed on the machine uses 99% of the OS CPU and you have 3 running moleculer nodes with 0% per-process CPU usage, it is not a good idea to send requests to them.

Wallacy commented 6 years ago

Hello,

I thought about that, but because Node.js is a single-threaded application, if only one moleculer node is running at 99% of one CPU/system thread, the total system workload can be about 7% (I have 8 cores / 16 threads), and this process/service can still receive calls even if there is another one at 0%. And using 100% of one thread is not that difficult.

Do you think we could weight local processes on the same machine to avoid this? Maybe a simple 75% system CPU and 25% process CPU would work for your example too, because those 3 running moleculer nodes would still be less likely to receive the call than nodes on other machines. But the values would differ enough to avoid the case of a moleculer node at 100% receiving calls. (50%/50% would probably work too.)

Or first check whether the system is above X% (50, 80, 90%?), and only if it is not, check the process CPU, etc. (and/or take the median).
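The two ideas above (a weighted blend, and a system threshold checked first) can be sketched as follows. Both function names and the default weight and limit are hypothetical illustrations of the proposal, not anything that exists in Moleculer.

```javascript
// Hypothetical helper: blend machine-wide and per-process CPU percentages
// into one score (lower score = better candidate for the next call).
function blendedCpuScore(systemPercent, processPercent, systemWeight = 0.75) {
  return systemWeight * systemPercent + (1 - systemWeight) * processPercent;
}

// Hypothetical helper: if the machine as a whole is at or above the limit,
// use the system value; otherwise fall back to the per-process value.
function thresholdCpuScore(systemPercent, processPercent, systemLimit = 80) {
  return systemPercent >= systemLimit ? systemPercent : processPercent;
}

// Idle node next to a DB that uses 99% of the machine.
console.log(blendedCpuScore(99, 0));    // 74.25
// Busy node (100% of one thread) on a machine that is 7% loaded overall.
console.log(blendedCpuScore(7, 100));   // 30.25
// Idle node on that same machine.
console.log(blendedCpuScore(7, 0));     // 5.25

console.log(thresholdCpuScore(99, 0));  // 99
console.log(thresholdCpuScore(7, 100)); // 100
```

With the 75/25 weights, the idle node beside the overloaded DB still scores worst, while the busy and idle nodes on the lightly loaded machine are now clearly separated.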

BTW, I'm testing the process usage with this:

'use strict'

// Sample this process's CPU usage over `sampleTime` milliseconds.
function getCpuUsage(sampleTime = 200) {
  return new Promise(resolve => {
    const startTime = process.hrtime()
    const startUsage = process.cpuUsage()
    setTimeout(() => {
      const elapTime = process.hrtime(startTime)
      const elapUsage = process.cpuUsage(startUsage)
      const elapTimeMS = secNSec2ms(elapTime)
      // process.cpuUsage() reports microseconds; convert to milliseconds.
      const elapUserMS = elapUsage.user / 1000
      const elapSystMS = elapUsage.system / 1000
      const cpuPercent = Math.round(100 * (elapUserMS + elapSystMS) / elapTimeMS)
      resolve({ avg: cpuPercent })
    }, sampleTime)
  })
}

// Convert a [seconds, nanoseconds] hrtime tuple to milliseconds.
function secNSec2ms (secNSec) {
  return secNSec[0] * 1000 + secNSec[1] / 1000000
}

// Test
getCpuUsage().then((cpu)=>{
  console.log(cpu.avg);
});

var inTime = Date.now();
let loop = setInterval(()=>{
  var now = Date.now();
  while (Date.now() - now < 75);
  getCpuUsage().then((cpu)=>{
    console.log(cpu.avg);
  });
  if (now - inTime > 1000){
    clearInterval(loop);
  }
}, 150);

getCpuUsage().then((cpu)=>{
  console.log(cpu.avg);
});

Anyway, it will always be a compromise for sure; I'm just trying not to let the less likely case rule over the most common one. I think that if someone is using moleculer on the server side, there is a high probability that they will avoid scenarios like this one (a DB running at 99% on the same machine). Good routing between moleculer nodes should be prioritized; 3rd-party apps will always be a random variable.

icebob commented 6 years ago

Node.js is not a single-threaded app; only JavaScript runs in a single thread. So if you do file IO, network IO, or anything else that is implemented natively as async, it will run on multiple threads. And I think that in most use cases you will use file or DB access or networking in your services, and not just add two numbers.
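As a rough illustration of native async work running off the JavaScript thread, the sketch below starts several `crypto.pbkdf2` calls, which Node.js executes on the libuv thread pool, and compares the CPU time reported by `process.cpuUsage()` with the elapsed wall-clock time. The function name `measureThreadPoolWork` and the job and iteration counts are arbitrary choices for this example; the resulting numbers depend on the machine, so no specific output is claimed.

```javascript
const crypto = require("crypto");

// Run `jobs` key derivations concurrently. crypto.pbkdf2 performs its work
// on libuv's thread pool rather than on the JavaScript thread.
function measureThreadPoolWork(jobs = 4, iterations = 100000) {
  const startTime = process.hrtime.bigint();
  const startUsage = process.cpuUsage();

  const tasks = Array.from({ length: jobs }, () =>
    new Promise((resolve, reject) => {
      crypto.pbkdf2("secret", "salt", iterations, 64, "sha512", err =>
        err ? reject(err) : resolve()
      );
    })
  );

  return Promise.all(tasks).then(() => {
    const wallMs = Number(process.hrtime.bigint() - startTime) / 1e6;
    // CPU time is reported in microseconds for the whole process.
    const usage = process.cpuUsage(startUsage);
    const cpuMs = (usage.user + usage.system) / 1000;
    return { wallMs, cpuMs, percent: Math.round(100 * cpuMs / wallMs) };
  });
}

measureThreadPoolWork().then(result => {
  // On a multi-core machine `percent` may exceed 100, because several
  // pool threads can be busy at the same time.
  console.log(result);
});
```

A per-process measurement therefore still captures this kind of work, but it is expressed relative to one core rather than to the whole machine.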

Btw, in the next version I will expose getCpuUsage from the registry to the broker, and you will be able to override it with your custom implementation, like broker.getCpuUsage = function() {....}. CpuUsageStrategy will use these values.

Wallacy commented 6 years ago

> So if you do file IO, network IO, or anything else that is implemented natively as async, it will run on multiple threads.

Yes, this is true for every language btw.

> And I think that in most use cases you will use file or DB access or networking in your services, and not just add two numbers.

Without a survey this is only a guess. My "guess" was based on the Docker world, where the golden rule is not to put more than one domain in the same context.

> Btw, in the next version I will expose getCpuUsage from the registry to the broker, and you will be able to override it with your custom implementation, like broker.getCpuUsage = function() {....}. CpuUsageStrategy will use these values.

Thanks. This can solve the problem.

icebob commented 6 years ago

In v0.13, with the new middlewares, you can change it with a middleware similar to this:

// middleware.js
module.exports = function ProcessCpuUsageMiddleware() {
    return {
        created(broker) {
            broker.getCpuUsage = function() {
                const avg = 5.0; // Get process CPU usages somehow
                return broker.Promise.resolve({ avg });
            }
        }
    }
};

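// broker options
const { ServiceBroker } = require("moleculer");
const ProcessCpuUsageMiddleware = require("./middleware");

const broker = new ServiceBroker({
    middlewares: [ProcessCpuUsageMiddleware()]
});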
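
The placeholder `avg = 5.0` in the middleware above could be filled in with the `process.cpuUsage()` sampling approach posted earlier in this thread. The following is only a sketch under assumptions: it relies on the `created(broker)` hook and the `broker.getCpuUsage` override exactly as described in the comment above, the helper `sampleProcessCpu` is a hypothetical name, and it uses the native Promise instead of `broker.Promise` so that it can be run without Moleculer installed.

```javascript
// Hypothetical helper: sample this process's CPU usage over `sampleTime` ms.
// The result is relative to one core, so it can exceed 100 on multi-core hosts.
function sampleProcessCpu(sampleTime = 200) {
  return new Promise(resolve => {
    const startTime = process.hrtime.bigint();
    const startUsage = process.cpuUsage();
    setTimeout(() => {
      const elapsedMs = Number(process.hrtime.bigint() - startTime) / 1e6;
      // process.cpuUsage() reports microseconds; convert to milliseconds.
      const usage = process.cpuUsage(startUsage);
      const cpuMs = (usage.user + usage.system) / 1000;
      resolve({ avg: Math.round(100 * cpuMs / elapsedMs) });
    }, sampleTime);
  });
}

// Middleware factory in the same shape as the example above
// (assumption: `broker.getCpuUsage` may be overridden as described there).
function ProcessCpuUsageMiddleware(sampleTime = 200) {
  return {
    created(broker) {
      broker.getCpuUsage = () => sampleProcessCpu(sampleTime);
    }
  };
}

// Stand-in object used only to exercise the hook without a real broker.
const fakeBroker = {};
ProcessCpuUsageMiddleware(50).created(fakeBroker);
fakeBroker.getCpuUsage().then(cpu => console.log(cpu.avg));
```

In a real setup the factory would be passed in the `middlewares` array of the broker options, as in the snippet above.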

Wallacy commented 6 years ago

Thanks, I will try!