open-telemetry / opentelemetry-js

OpenTelemetry JavaScript Client
https://opentelemetry.io
Apache License 2.0

Metrics with NodeJS cluster mode #1252

Open tiagonapoli opened 4 years ago

tiagonapoli commented 4 years ago

I'm trying to use the Prometheus exporter with cluster mode. However, each worker tries to spawn its own server to export metrics, which results in an error. I thought of creating the MeterProvider only in the master process, so the exporter would be initialized only once, but I don't think this would work, since the workers would still need access to that meter provider.

prom-client, which the Prometheus exporter is based on, documents how to use it with cluster mode: https://github.com/siimon/prom-client#usage-with-nodejss-cluster-module

Node.js's cluster module spawns multiple processes and hands off socket connections to those workers. Returning metrics from a worker's local registry will only reveal that individual worker's metrics, which is generally undesirable. To solve this, you can aggregate all of the workers' metrics in the master process. See example/cluster.js for an example.

How do I set up @opentelemetry/metrics for use with cluster mode? Should I create a custom exporter in the workers that sends the metrics to the master, which would then export them to Prometheus? I have just started learning how to use OpenTelemetry, so I don't have any ideas :(
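For illustration, the "workers send to master, master exports" idea could be sketched with nothing but Node's built-in cluster IPC. This is a rough sketch, not an OpenTelemetry API: `METRICS_MSG`, `createWorkerCounters`, `createPrimaryStore`, and `wirePrimary` are hypothetical names, and the counters are plain numbers rather than SDK instruments. Each worker sends a cumulative snapshot, and the primary keeps the latest snapshot per worker and sums them when asked.

```javascript
const cluster = require('node:cluster');

// Hypothetical message tag; not part of any OpenTelemetry or prom-client API.
const METRICS_MSG = 'metrics-snapshot';

// Worker side: keep plain cumulative counters and ship a snapshot on flush().
// `send` is injected; in a real worker it would be (msg) => process.send(msg).
function createWorkerCounters(send) {
  const counts = new Map();
  return {
    add(name, value = 1) {
      counts.set(name, (counts.get(name) || 0) + value);
    },
    flush() {
      send({ type: METRICS_MSG, counts: Object.fromEntries(counts) });
    },
  };
}

// Primary side: remember the latest cumulative snapshot per worker.
function createPrimaryStore() {
  const latest = new Map(); // worker id -> { counterName: value }
  return {
    onMessage(workerId, msg) {
      if (msg && msg.type === METRICS_MSG) latest.set(workerId, msg.counts);
    },
    // Sum the latest snapshots of all workers into one set of totals.
    totals() {
      const out = {};
      for (const counts of latest.values()) {
        for (const [name, value] of Object.entries(counts)) {
          out[name] = (out[name] || 0) + value;
        }
      }
      return out;
    },
  };
}

// Fork workers and route their IPC messages into the store (primary only).
function wirePrimary(store, workerCount) {
  for (let i = 0; i < workerCount; i++) {
    const worker = cluster.fork();
    worker.on('message', (msg) => store.onMessage(worker.id, msg));
  }
}
```

In this sketch only the primary process would open an HTTP endpoint and serve `store.totals()`, which avoids every worker trying to bind the same port. Workers would call `flush()` on a timer.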

vin-mad commented 2 years ago

Hi @tiagonapoli, I'm facing the same issue. Did you find a solution for this?

pkarakal commented 2 years ago

Has anyone solved this? The OTel metrics library is hitting stable soon, and it would be great if it could support this, as many production Node.js applications are deployed in cluster mode (e.g. with pm2).

legendecas commented 2 years ago

I believe this is common to Node.js use cases and prom-client also provides support for this.

tiagonapoli commented 2 years ago

Unfortunately, I didn't find a solution at the time, and it has been a while since I last worked with Node.js.

dyladan commented 2 years ago

Hmm, I'm not an expert in the cluster module, although I have used it in the distant past. I wonder how much work it would be to support this, or whether it would just be a matter of properly documenting a setup process.

legendecas commented 2 years ago

I think we can provide a contrib package to support cluster metrics collection, rather than in the sdk-metrics. I'm planning to work on a POC to see what we can provide with that package.

jdmarshall commented 1 year ago

I've looked at the prom-client example for this, and the library exposes logic that can be used for taking the metrics returned from a cluster worker and manipulating it to do the aggregation. The good news/bad news is that a lot of the example uses functionality from the library.

It's not clear to me that there are similar facilities in opentelemetry-js. This codebase is not exactly built for understanding by trace debugging either.

I think you'd be stuck implementing your own metric parsing and coalescing code, which is why it would be good if someone with inside knowledge tried to do this instead of leaving it as an exercise to the user.
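To make the "coalescing" part concrete, here is a minimal sketch of merging per-worker data points by metric name and attribute set. The data point shape (`name`, `attributes`, `value`) and the helper names `seriesKey` and `coalesce` are assumptions for illustration only; they are not the opentelemetry-js data model. Summing like this is only meaningful for counter-style metrics. Gauges and histograms would need their own merge rules.

```javascript
// Build a stable key so { a: 1, b: 2 } and { b: 2, a: 1 } land in the same series.
function seriesKey(name, attributes) {
  const sorted = Object.keys(attributes)
    .sort()
    .map((key) => [key, attributes[key]]);
  return JSON.stringify([name, sorted]);
}

// Merge an array of per-worker snapshots (each an array of data points)
// into one array, summing values that share a name and attribute set.
function coalesce(workerSnapshots) {
  const merged = new Map();
  for (const points of workerSnapshots) {
    for (const { name, attributes = {}, value } of points) {
      const key = seriesKey(name, attributes);
      const existing = merged.get(key);
      if (existing) {
        existing.value += value;
      } else {
        // Copy so the merged result never aliases a worker's own objects.
        merged.set(key, { name, attributes: { ...attributes }, value });
      }
    }
  }
  return [...merged.values()];
}
```

The result keeps one entry per distinct series, in first-seen order, which a single exporter in the primary process could then serialize.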

jdmarshall commented 1 year ago

@tiagonapoli any progress on solving this problem for your own project?

baba4ba commented 1 week ago

If it's ready, please give me a kick.