Open pablote opened 1 year ago
I too am wondering how to solve this problem
Next.js does some weird bundling stuff and have an odd global environment. There might be something off due to that? Probably better to ask in their support channels.
I don't think we've had requests to work in a browser context before, but that should be fine. Happy to take PRs allowing it.
I run the prom-client code locally on my computer and I was able to see what the problem was. Many of the default metrics run code outside of any function, so even if you don't use them, just by importing this code gets run. Some of that code uses Node.js libraries or grobals.
I changed all the cases that caused problems so that the code would be inside the function. See these couple examples:
After doing this the error went away. That said, this is still not working well, in strange ways. I need to keep testing around, but apparently this is not gonna be easy.
I don't think we've had requests to work in a browser context before, but that should be fine. Happy to take PRs allowing it.
It's not a browser environment. It's a server-side JS runtime, only that Node.js specific APIs are not allowed. I'm not sure how Vercel runs code, maybe it's a Lambda@Edge thing but not sure. I know that Cloudfare Workers have the same restrictions.
I've managed to get this working on Next.js, but it required quite a few changes on prom-client which I'm not sure if the project would be willing to consider merging @SimenB.
You can check the changes in this fork here: https://github.com/siimon/prom-client/compare/master...pablote:prom-client:feature/next-friendly-changes?expand=1. To summarize:
/metrics
options
second ctor param is now a fetch options, not a http/https module options, they are similar but not the same.Making require
s lazy and migrating to URL
global should both be fine to land at least. Let's start with that? Then fetch
is a natural one to target after that. Node 16 is EOL shortly, so we might get away with using global fetch
as well, but I wanna evaluate that separately
For simple default metrics, creating an api route works well in Next.js:
collectDefaultMetrics({});
const prometheus = async (_req: NextApiRequest, res: NextApiResponse) => {
res.setHeader('Content-type', register.contentType);
res.send(await register.metrics());
};
export default prometheus;
As for using prom-client
anywhere else I've only been able to do so in getServerSideProps
, as the middleware expects Vercel's edge runtime compatability. Quite a lot of (imo understandably) unhappy discussion about it here. It's very annoying for us as we are running dockerised node, we have nothing but the node runtime, and we get these edge runtime warnings.
In the end my getServerSideProps
all have a wrapper that looks something like this:
import { GetServerSideProps, GetServerSidePropsContext } from 'next';
import { Counter, Histogram, register } from 'prom-client';
// Even though it may look like nothing has been registered yet, if next
// recompiles due to HMR prom-client will throw errors when calling
// new Histogram()/new Gauge()/new Counter() as metrics with the same name
// already exist in the registry.
// https://github.com/siimon/prom-client/issues/196
register.removeSingleMetric('nextjs_gSSP_duration_seconds');
register.removeSingleMetric('nextjs_page_requests_total');
export const histograms = {
gSSPDurationSeconds: new Histogram({
name: 'nextjs_gSSP_duration_seconds',
help: 'Duration of gSSP in seconds',
labelNames: baseLabelNames,
buckets: [0.01, 0.05, 0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, 10],
}),
};
export const counters = {
pageRequestsTotal: new Counter({
name: 'nextjs_page_requests_total',
help: 'Count of page requests',
labelNames: baseLabelNames,
}),
};
// Register metrics on the default registry
register.registerMetric(histograms.gSSPDurationSeconds);
register.registerMetric(counters.pageRequestsTotal);
export const observedGetServerSideProps =
(gSSP: GetServerSideProps, route: string) => async (context: GetServerSidePropsContext) => {
// All incoming requests will have some derivable labels
const baseLabels = { route, foo: 'bar' }
// Run gSSP
const endGSSPDurationTimer = histograms.gSSPDurationSeconds.startTimer();
const response = await gSSP(context);
const duration = endGSSPDurationTimer(baseLabels);
// Capture metrics
histograms.gSSPDurationSeconds.observe(duration);
counters.pageRequestsTotal.labels(baseLabels).inc();
// Return gSSP response result
return response;
};
That is used like this
const unwrappedGetServerSideProps: GetServerSideProps = () => {
// your gSSP here
};
export const getServerSideProps: GetServerSideProps = observedGetServerSideProps(
unwrappedGetServerSideProps, '/[destination]'
);
I did find that the observedGetServerSideProps
had to be defined in another file, if I used something similar in the actual pages directory Next would complain at build, the same as your issue @pablote. I'm not entirely sure why a separate module gets around the warning, but Next doesn't mind 🤷🏼♂️
I would be very interested in improvements to allow us to use the middleware in Next rather than this wrapper, as currently we cannot use prom-client
for any statically optimised pages (no getServerSideProps
/getStaticProps
). And the wrapping isn't ideal, it's very all-or-nothing for adding metrics to a page.
Seems like we've run into a similar picture. Using prom-client
works well from getServerSideProps
and the get static one as well, also from API routes. The thing is you have to add that individually everywhere, and it still doesn't catch some calls. Middleware is the natural place to do that, but like I said above, it only works if I heavily modify prom-client
to not use anything Node.js specific.
I've seen some suggestions like using a custom server to host the Next.js app, but that's not an option for me.
That said, it looks like Next.js is integrating OpenTelemetry as a functionality, although it's still an experimental feature. So I'll probably not pursue this any further and see how that pans outs.
it looks like Next.js is integrating OpenTelemetry as a functionality
Unfortunately this is where we're at too - I've only added prom-client
in a couple of our next.js sites, but we have 10+ across various teams, and I need to recommend something for usage everywhere. That said I would like to avoid OpenTelemetry if I can, as we're already all-in on prom w/ push gateway for our java services.
It doesn't have to be one or the other I think. You can have an app using OT pushing metrics to a OT collector (this is very similar to pushing prom metrics to a push gateway). And then have Prometheus consume those from the collector, check this out: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/examples/demo
That said, last time I tried this in Next.js, it was only pushing trace/spans style metrics, and not counter/gauge/histogram prom style metrics. Not sure if it's a WIP and they plan to add that in the future.
For simple default metrics, creating an api route works well in Next.js:
collectDefaultMetrics({}); const prometheus = async (_req: NextApiRequest, res: NextApiResponse) => { res.setHeader('Content-type', register.contentType); res.send(await register.metrics()); }; export default prometheus;
@harry-gocity @pablote I've seen a few examples online with this approach, can you confirm that calling
collectDefaultMetrics({})
outside of the API handler/inside the API handler file works as intended?
I am trying to migrate a Next app away from using a custom server and currently collectDefaultMetrics
is being called when the custom server is initially invoked. My intuition is that this function should be called globally when the application is started, and so I'm wondering if calling it within the api/metrics.ts
will work as expected. I have tried to move this invokation inside the instrumentation.ts
file but with no luck as of yet (I get a 200 empty response)
can you confirm that calling collectDefaultMetrics({}) outside of the API handler/inside the API handler file works as intended?
We've since upgraded to next 14 (still pages dir) and now have an instrumentation.ts
that looks like this:
export async function register() {
console.log('Configuring instrumentation');
if (process.env.NEXT_RUNTIME === 'nodejs') {
const prom = await import('prom-client');
console.log('Configuring default prom metrics');
prom.collectDefaultMetrics({});
console.log('Creating custom prom metrics');
new prom.Histogram({
...
});
new prom.Counter({
...
});
new prom.Counter({
...
});
}
}
We don't call collectDefaultMetrics
anywhere else. And to access the metrics, we use register.getSingleMetric
rather than a reference to the created metric directly.
Thanks @harry-gocity I was missing the instrumentationHook
flag in the next config file. Two questions:
await prom.register.metrics();
at the end of this register function?At the end of the register block I am able to call getSingleMetric()
and it returns the correct value, however when I try to do the same thing in an API route I get undefined
:
import { NextApiRequest, NextApiResponse } from 'next';
import { register } from 'prom-client';
const handler = async (_: NextApiRequest, res: NextApiResponse) => {
const metric = register.getSingleMetric(
'my_app_http_request_count',
);
console.log(metric); // returns undefined
console.log(register.getMetricsAsArray()); // returns []
res.send('Text');
};
export default handler;
Here is my exact instrumentation.ts
file:
export async function register() {
console.log('Configuring instrumentation');
if (process.env.NEXT_RUNTIME === 'nodejs') {
const prom = await import('prom-client');
console.log('Configuring default prom metrics');
prom.collectDefaultMetrics({
prefix: 'my_app',
});
console.log('Creating custom prom metrics');
new prom.Counter({
name: 'my_app_http_request_count',
help: 'Count of HTTP requests made to the app',
labelNames: ['method'],
});
await prom.register.metrics(); // is this needed?
console.log(prom.register.getMetricsAsArray()); // returns correct data
console.log(prom.register.getSingleMetric('my_app_http_request_count')); // returnes correct data
}
}
Do you call await prom.register.metrics(); at the end of this register function? Do you have an API route where you expose all of the metrics to be scraped from?
We don't call prom.register.metrics()
in instrumentation - that's the function call to make to get the actual metrics string. You should call it in an api route like this:
import { NextApiRequest, NextApiResponse } from 'next';
import { register } from 'prom-client';
const prometheus = async (_req: NextApiRequest, res: NextApiResponse) => {
res.setHeader('Content-type', register.contentType);
res.send(await register.metrics());
};
export default prometheus;
After a lot of debugging I finally found what was missing. For anyone who has a similar issue, in my next.config.js
I needed both the instrumentationHook
and:
...
experimental: {
instrumentationHook: true,
serverComponentsExternalPackages: ['prom-client'],
},
...
and then calling await register.metrics()
works within an API handler.
Also in relation to the original issue, while you can't increment a counter directly within the middleware.ts
file, you can call to an exposed API from the middleware (example /pages/api/metrics/counter.ts
) and increment this way.
I'm having issues making this work on production build. I setup instrumentation as per this example https://github.com/siimon/prom-client/issues/584#issuecomment-2202375969 , in the api route called register.metrics()
and added experimental next config. If I run locally (yarn dev
) everything works as expected and I'm able to increase counter from another api route or to get all the metrics from the metrics endpoint. But as soon as I run the build, this stops working.
In the logs I do see that instrumentation code runs and inside the instrumentation I'm able to get counter value or any other metric but once I call metrics route, I receive empty response.
What could be causing this to behave differently in production build?
Note: Using NextJS app router.
FWIW I've somewhat started to give up with this lib - we're now using @vercel/otel
and @opentelemetry
packages instead, and scraping metrics with a prometheus exporter instead. The integration with Next.js is much cleaner, we have better tracing on GSSP (segmented by component load, fetch calls, and render) and we don't have to jump through bundling hoops to get a build passing. The only thing we still use this library for is the default metrics, the custom ones are all being migrated.
Just FYI I was able to resolve my issue, left more in-depth explanation here: https://github.com/siimon/prom-client/issues/640#issuecomment-2316946778
@harry-gocity can you share some examples using @vercel/otel
and @opentelemetry
?
@santosguilherme
I suppose this might help you.
https://github.com/vercel/next.js/discussions/16205#discussioncomment-9360516
Hi, I know this library is supposed to be framework agnostic, but it seems to be generating a bad interaction with Next.js. I have a counter defined like this:
This counter works well if used from within an API route for example. Thing is, I want it to catch every single request, so I use it where it makes most sense, in a middleware:
Doing this throws the following error:
This might not make a lot of sense if you don't know Next.js but some code, like the middleware, runs on a different JS runtime where it cannot be assumed that Node.js global APIs are available. My main question is: why would a counter.inc() call process.uptime? Seems unnecessary.
Ideally, no Node.js specific APIs should be used, only standard Web APIs, unless specifically required (and being able to opt-out).
thanks for the help