Open simonireilly opened 3 years ago
Thanks for raising @simonireilly.
I can see the value in having this. Could you provide a bit more detail about what you'd expect to have in CloudWatch? Ideally we would provide some level of feature parity with the newly announced Next.js Analytics.
My only concern is whether CloudWatch is geared well enough for this (especially in terms of Dashboards / Visualisations).
Adding a bit more detail about exactly which features you'd like to see and how they would be implemented in CloudWatch would help move things along more quickly.
@danielcondemarin, sure, I can elaborate.
It is probably best to begin with the implementation in NextJS source code.
https://www.github.com/vercel/next.js/tree/canary/packages%2Fnext%2Fclient%2Fperformance-relayer.ts
This performance-relayer.ts module is required in client/index. It is mounted in a useEffect hook to run after DOMContentLoaded has occurred.
Line 49 has the hardcoded endpoint for vercel-analytics.com. I would expect this to be a process.env value in the future, as hard-coding this proprietary endpoint is not OSS in my opinion.
This module will observe the metrics required from the web-vitals package.
Capturing these as Custom CloudWatch Metrics would be the proposed solution. There could be unique metric names for each of these five webvitals per deployment.
MyNextServerlessCLS
MyNextServerlessFCP
MyNextServerlessFID
MyNextServerlessLCP
MyNextServerlessTTFB
MetricDatum can be seconds, as sent in the value field of the body from the Next.js performance-relayer.ts.
These can be visualised in percentiles as the user wants, but a default of P75 (the recommended benchmark, https://web.dev/vitals/) would be ideal for any dashboard.
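As a rough sketch of how the P75 default could be requested, the object below follows the shape of a CloudWatch GetMetricData query with a percentile statistic. The NextJsApplication namespace, the one-day period, and the p75Queries helper are assumptions for illustration; the metric names are the ones listed above.

```typescript
// Shape of one entry in GetMetricData's MetricDataQueries
// (only the fields used in this sketch).
type MetricDataQuery = {
  Id: string
  MetricStat: {
    Metric: { Namespace: string; MetricName: string }
    Period: number // aggregation window in seconds
    Stat: string // 'p75' asks for the 75th percentile
  }
  ReturnData: boolean
}

const VITALS = ['CLS', 'FCP', 'FID', 'LCP', 'TTFB'] as const

// Build one p75 query per web vital. `prefix` and `namespace` are
// hypothetical per-deployment values, not part of any existing API.
export function p75Queries(
  prefix: string,
  namespace: string,
  periodSeconds = 86400
): MetricDataQuery[] {
  return VITALS.map((vital) => ({
    // Query ids must start with a lowercase letter.
    Id: `p75_${vital.toLowerCase()}`,
    MetricStat: {
      Metric: { Namespace: namespace, MetricName: `${prefix}${vital}` },
      Period: periodSeconds,
      Stat: 'p75',
    },
    ReturnData: true,
  }))
}

const queries = p75Queries('MyNextServerless', 'NextJsApplication')
console.log(queries.map((q) => q.MetricStat.Metric.MetricName).join(', '))
```

The resulting array could be passed as MetricDataQueries to a GetMetricData call, or mirrored in a dashboard widget definition.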
To enhance these metrics it would be possible to add some dimensions.
Device='mobile' | 'tablet' | 'desktop'
Page=${body.page} (Next page, e.g. /[slug].js)
Region=${process.env.AWS_REGION} (default Lambda env)
It is worth noting that you cannot aggregate custom metrics along multiple dimensions; being overly specific means you cannot re-aggregate for the global metric.
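Because CloudWatch does not aggregate custom metrics across dimensions, one way to keep a global figure is to publish each value twice: once without dimensions and once with them. The sketch below assumes that approach. The deviceFromUserAgent helper and its regexes are purely illustrative and are not a robust device detector.

```typescript
type Dimension = { Name: string; Value: string }
type Datum = {
  MetricName: string
  Value: number
  Unit: 'Milliseconds' | 'None'
  Dimensions?: Dimension[]
}
type Device = 'mobile' | 'tablet' | 'desktop'

// Hypothetical, deliberately naive user-agent check.
export function deviceFromUserAgent(userAgent: string): Device {
  if (/ipad|tablet/i.test(userAgent)) return 'tablet'
  if (/mobi|iphone|android/i.test(userAgent)) return 'mobile'
  return 'desktop'
}

// Return two data points for one web vital: an un-dimensioned one that
// feeds the global metric, and a dimensioned one for drill-down.
export function toDatums(
  vital: { event_name: string; page: string; value: string },
  userAgent: string,
  region: string
): Datum[] {
  const base: Datum = {
    MetricName: vital.event_name,
    Value: parseFloat(vital.value),
    // CLS is a unitless score; the other vitals are timings.
    Unit: vital.event_name === 'CLS' ? 'None' : 'Milliseconds',
  }
  return [
    base,
    {
      ...base,
      Dimensions: [
        { Name: 'Device', Value: deviceFromUserAgent(userAgent) },
        { Name: 'Page', Value: vital.page },
        { Name: 'Region', Value: region },
      ],
    },
  ]
}
```

The trade-off is twice the number of data points per report, which is worth weighing against the custom metric pricing.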
These dashboards are somewhat limited, but the main components can be achieved:
The last piece is the pages section. Having this dimension will be helpful I think, but visualizing so many dimensions might be overload.
Some trial and error might be required.
Hope that makes sense. The architecture would be as described previously.
Hey @simonireilly Thanks for the great level of detail!
Line 49 has the hardcoded endpoint for vercel-analytics.com. I would expect this to be a process.env in the future as this proprietary endpoint hard coding is not OSS in my opinion.
Surprised to see this! Sounds like we need to raise an issue in Next.js first to sort that out.
Also might be worth thinking if the distributed nature of Lambda@Edge CloudWatch Logs affects anything!
Sounds good, I have opened an issue, we will see if there is any desire for the framework to make the change.
https://github.com/vercel/next.js/issues/18907
This is a simple implementation in pages/api/v1/vitals:
import { NextApiRequest, NextApiResponse } from 'next'
import CloudWatch, { MetricDatum } from 'aws-sdk/clients/cloudwatch'

// Body posted by the Next.js performance relayer.
type WebVital = {
  dsn: string
  id: string
  page: string
  href: string
  event_name: string
  value: string
  speed: string
}

const cloudWatchClient = new CloudWatch({ apiVersion: '2010-08-01' })

// Map one web vital onto a CloudWatch datum, with the page as a dimension.
const params = (webVital: WebVital): MetricDatum => ({
  MetricName: webVital.event_name,
  Dimensions: [
    {
      Name: 'NextJSPage',
      Value: webVital.page
    },
  ],
  Unit: 'Milliseconds',
  Value: parseFloat(webVital.value)
})

const handler = async (req: NextApiRequest, res: NextApiResponse) => {
  try {
    const webVital: WebVital = req.body
    const metricData = params(webVital)
    await cloudWatchClient.putMetricData({
      MetricData: [metricData],
      Namespace: 'NextJsApplication'
    }).promise()
    // Respond with an empty 200 once CloudWatch has accepted the metric.
    return res.status(200).end()
  } catch (err) {
    console.error(err)
    return res.status(422).json({
      error: 'Failed to send Metrics, check server/lambda logs for details'
    })
  }
}

export default handler
You receive each web vital, for each page, as a metric that can be aggregated with any available CloudWatch statistic. A simple number board below gives the p75 of all metrics for all pages over a day in all regions.
Looks good, I would say to use the AWS SDK v3 for this though, since it is modular it has very low single-digit ms cold start times. We are using that for S3 calls within the handler.
For CloudWatch API call, will you be sending to a single region or it is distributed to the closest region from where the Lambda@Edge was invoked?
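For reference, a v3 version of the same call might look roughly like the sketch below. CloudWatchClient and PutMetricDataCommand are the client and command exported by @aws-sdk/client-cloudwatch; the wiring is left in comments so the snippet stays dependency-free. The buildPutMetricDataInput helper is a hypothetical name, and the pinned region is an assumption showing what the single-region option could look like.

```typescript
type WebVital = { page: string; event_name: string; value: string }

// Plain object matching the input shape of PutMetricDataCommand.
export function buildPutMetricDataInput(webVital: WebVital) {
  return {
    Namespace: 'NextJsApplication',
    MetricData: [
      {
        MetricName: webVital.event_name,
        Dimensions: [{ Name: 'NextJSPage', Value: webVital.page }],
        Unit: 'Milliseconds' as const,
        Value: parseFloat(webVital.value),
      },
    ],
  }
}

// Wiring with the modular v3 client (requires @aws-sdk/client-cloudwatch):
//
//   import { CloudWatchClient, PutMetricDataCommand } from '@aws-sdk/client-cloudwatch'
//
//   // Assumed: every edge location reports to one fixed region.
//   const client = new CloudWatchClient({ region: 'us-east-1' })
//   await client.send(new PutMetricDataCommand(buildPutMetricDataInput(webVital)))
```

Creating the client outside the handler, as in the v2 version above, lets warm invocations reuse it.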
Looks good, I would say to use the AWS SDK v3 for this though, since it is modular it has very low single-digit ms cold start times. We are using that for S3 calls within the handler.
👍
For CloudWatch API call, will you be sending to a single region or it is distributed to the closest region from where the Lambda@Edge was invoked?
That is maybe up for discussion, would be good to know if cross-region metrics are desirable before committing to an architecture and cost for them.
Options I would say are:
If latency is to be truly minimised then firing a lambda asynchronously would be the best bet. This means we just fire the body to the API, we don't wait for the cold boot, or HTTPS handshake between Lambda and CloudWatch API, or anything really. Lambda will handle queuing this up and retrying.
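A minimal sketch of that fire-and-forget option, assuming a separate metrics Lambda exists: with InvocationType set to Event, the Invoke API queues the payload and returns without waiting for the function to run. The function name vitals-to-cloudwatch and the asyncInvokeParams helper are hypothetical.

```typescript
// Subset of the Lambda Invoke parameters used in this sketch.
type InvokeParams = {
  FunctionName: string
  InvocationType: 'Event' | 'RequestResponse'
  Payload: string
}

// Build parameters for an asynchronous invocation.
// 'Event' hands the payload to Lambda's internal queue, which also
// takes care of retries if the function fails.
export function asyncInvokeParams(functionName: string, body: unknown): InvokeParams {
  return {
    FunctionName: functionName,
    InvocationType: 'Event',
    Payload: JSON.stringify(body),
  }
}

// Wiring with AWS SDK v2, as used in the handler earlier in this thread:
//
//   import Lambda from 'aws-sdk/clients/lambda'
//
//   const lambda = new Lambda()
//   await lambda.invoke(asyncInvokeParams('vitals-to-cloudwatch', req.body)).promise()
```

The handler still waits for the Invoke API call itself to be acknowledged, but not for the metrics function to boot or to talk to CloudWatch.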
@dphang Is this possible on the edge? I am not sure it is.
Final thing, it's a no from NextJS for making this extensible https://github.com/vercel/next.js/issues/18907#issuecomment-723300381
With that being said this is potentially a non-starter as it would require custom implementation. I don't believe that is an attractive proposition but if there is still an interest in the feature then it can be done 🤷♂️
I think Lambda@Edge is pretty similar to Lambda right now; there aren't many limitations anymore (for origin handlers), save for the environment variables and no provisioned concurrency.
I think you can make an async call and not wait for the response. But I thought this is a reporting API from the client side anyway, i.e. it doesn't block rendering of the page itself? I haven't used this new feature yet, so I'm not as familiar with it.
I did see from here that you can send the metrics to any endpoint, e.g. the example they gave was for Google Analytics. I guess you want to build this into the Lambda@Edge itself to send data to CloudWatch instead?
With that being said this is potentially a non-starter as it would require custom implementation. I don't believe that is an attractive proposition but if there is still an interest in the feature then it can be done 🤷♂️
What do you think about introducing our own performance-relayer client implementation? To start with, it could be similar to, or the same as, the Next.js Vercel one.
So if users opt-in to the Analytics functionality we'd bootstrap the backend Analytics endpoint in Lambda@Edge for them.
component: "@sls-next/serverless-component"
inputs:
  analytics: true
Client side they could install an NPM module, e.g.
// pages/_app.js
export { default as reportWebVitals } from '@sls-next/analytics-client';
Later on, we could provide some way to allow for extensibility.
I think you can make an async call and don't wait for the response. But I thought this is a reporting API from client side anyway, i.e it doesn't block rendering of the page itself? I haven't used this new feature yet so not as familiar with it.
That's right @dphang, it doesn't block rendering. Ideally we would support using the Beacon API, which is generally more efficient than using fetch directly and still delivers metrics that are sent while the page is unloading (e.g. on an external link click).
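A client-side sketch of that idea, assuming the /api/v1/vitals path from the implementation earlier in this thread: it prefers navigator.sendBeacon and falls back to fetch with keepalive. The transports are injected so the function can be exercised outside a browser; createReporter and its parameter names are illustrative and not part of any published package.

```typescript
type Metric = { id: string; name: string; value: number }
type BeaconFn = (url: string, data: string) => boolean
type FallbackFn = (url: string, data: string) => void

// Build a reportWebVitals-style callback that posts each metric to `endpoint`.
export function createReporter(
  endpoint: string,
  beacon: BeaconFn | undefined,
  fallback: FallbackFn
) {
  return (metric: Metric): void => {
    const body = JSON.stringify({
      id: metric.id,
      event_name: metric.name,
      value: metric.value.toString(),
    })
    // sendBeacon returns false when the browser could not queue the data,
    // in which case the fallback transport is used instead.
    if (beacon && beacon(endpoint, body)) return
    fallback(endpoint, body)
  }
}

// Browser wiring, e.g. in pages/_app.js. The transports are resolved inside
// the function so nothing touches `navigator` during server-side rendering:
//
//   export function reportWebVitals(metric) {
//     createReporter(
//       '/api/v1/vitals',
//       navigator.sendBeacon ? navigator.sendBeacon.bind(navigator) : undefined,
//       (url, data) => { fetch(url, { method: 'POST', body: data, keepalive: true }) }
//     )(metric)
//   }
```

Note that sendBeacon sends a string body as text/plain, so the receiving handler would need to JSON.parse the body itself.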
just wanting to drop in and say this would be mega cool, thanks for all the hard work ^
Is your feature request related to a problem? Please describe.
With reportWebVitals in a custom _app.js we can support recording Web Vitals as per docs: https://nextjs.org/docs/advanced-features/measuring-performance#web-vitals
Describe the solution you'd like
/analytics endpoint.
Describe alternatives you've considered
Two popular managed services for this:
These are good services, but they are paid services.
Additional context
Background reading: https://developer.mozilla.org/en-US/docs/Learn/Performance/Measuring_performance