aws-powertools / powertools-lambda-typescript

Powertools is a developer toolkit to implement Serverless best practices and increase developer velocity.
https://docs.powertools.aws.dev/lambda/typescript/latest/
MIT No Attribution
1.58k stars 139 forks source link

Feature request: support high resolution metrics #1277

Closed dreamorosi closed 1 year ago

dreamorosi commented 1 year ago

Use case

Amazon CloudWatch announced support for high resolution metric extraction from structured logs (EMF). Customers can now provide an optional StorageResolution parameter within the EMF specification with a value of 1 or 60 (default) to indicate the desired resolution (in seconds) of the metric.

We should consider adding support for this new optional parameter to Metrics.

The EMF log should have this format:

{
  "_aws": {
    "CloudWatchMetrics": [
      {
        "Metrics": [
          {
            "Name": "Time",
            "Unit": "Milliseconds",
            "StorageResolution": 60 // <- new key
          }
        ],
        ...
      }
    ]
  },
  "Time": 1
}

As part of this issue we should also update the API docs, documentation, and unit/integration tests.

Solution/User Experience

Proposed DX:

import { Metrics, MetricUnits, MetricResolutions } from '@aws-lambda-powertools/metrics';

const metrics = new Metrics();

export const handler = async (_event: any, _context: any): Promise<void> => {
    // Publish a metric with standard resolution i.e. StorageResolution = 60
    metrics.addMetric('successfulBooking', MetricUnits.Count, 1, MetricResolutions.Standard);
    // Publish a metric with high resolution i.e. StorageResolution = 1
    metrics.addMetric('failedBooking', MetricUnits.Count, 1, MetricResolutions.High);

    // The last parameter (storage resolution) is optional
    metrics.addMetric('successfulUpgrade', MetricUnits.Count, 1);
};

To support the proposal, we should add a new MetricResolutions enum that looks similar to this:

enum MetricResolutions {
  Standard = 60,
  High = 1,
}

supported by types like:

type MetricResolutionStandard = MetricResolutions.Standard;
type MetricResolutionHigh = MetricResolutions.High;

type MetricResolution = MetricResolutionStandard | MetricResolutionHigh;

export {
  MetricResolution,
  MetricResolutions,
};

as well as add a new optional parameter to the Metrics.addMetric() method, which would change its signature to:

public addMetric(name: string, unit: MetricUnit, value: number, storageResolution?: MetricResolution): void

Note The EMF specification states that if a value is not provided, then a default value of 60 is assumed (aka standard resolution). For this reason, if the storageResolution parameter is not specified we won't add the resolution.

The change will likely also require modifying the logic of the Metrics.storeMetric private method to allow the new optional field.

Alternative solutions

No response

Acknowledgment

niko-achilles commented 1 year ago

@dreamorosi i can contribute to this new feature .

dreamorosi commented 1 year ago

Thank you Niko, looking forward to collaborate on this one.

Before starting with the implementation it'd be interesting to give some thought on whether storageResolution is the best name for the new parameter. In my proposal I used that name because of how it's called in the EMF specification, however thinking forward to when we might support other providers we should choose a name that fits with most. Can we research if this type of parameter exists in other providers, and if so have a discussion on how to call it? For now I'd start with the main observability providers like Datadog, Prometheus, New Relic, etc. (suggestion from @heitorlessa)

niko-achilles commented 1 year ago

@dreamorosi and @heitorlessa given a DX proposal is expressed in the use case description , and taking under consideration as mentioned in the comment about supporting providers , a research about the parameter name is an efficient cause for word choice.

I have some questions for the research.

StorageResolution is a property name of the EMF Data Model, noun phrase, group of nouns Storage and Resolution. Refers to Decision/Action. The name of that property reflects that CloudWatch stores the metric with... resolution .

The research is about the parameter name: storageResolution in the function signature

public addMetric(name: string, unit: MetricUnit, value: number, storageResolution?: MetricResolution): void

The research should answer the following questions in steps:

Is the parameter name storageResolution based on meaning ? Are other providers chosen a word based on meaning ? is the meaning the same as cloudwatch concept ?
Do observability specifications exist in this field of Metrics ? What word have the authors of the specification chosen ? Is the word based on meaning ? Is the parameter name storageResolution descriptive enough to use with its type to determine the meaning in usage scenarios ? Does the type MetricResolution provide assistance to a developer as guide for allowed values ?
Is the plural phase MetricResolutions of the enum descriptive to choose the right value or decide to use the default value ?

dreamorosi commented 1 year ago

Hey Niko, thanks for breaking down the topic.

I would say these two:

Are other providers chosen a word based on meaning ? is the meaning the same as cloudwatch concept ? Do observability specifications exist in this field of Metrics ? What word have the authors of the specification chosen ? Is the word based on meaning ?

As you correctly point out, the current storage resolution comes from the EMF specification, which we support today, so we might still need/want to stick to it.

I think at this stage we just wanted to investigate if there's any similar parameter that covers a similar concept in other providers, but just for reference, Powertools for Python chose to go with the EMF parameter name and already made a release earlier today, so maybe we can do the same?

niko-achilles commented 1 year ago

I think at this stage we just wanted to investigate if there's any similar parameter that covers a similar concept in other providers, but just for reference, Powertools for Python chose to go with the EMF parameter name and already made a release earlier today, so maybe we can do the same?

Andrea, yes of course . because storageResolution is based on meaning, and also pascalCase which is a Typescript style

Should i continue with these questions ?

heitorlessa commented 1 year ago

Please allow me to take time to respond this next week, as soon as I have a break.

@leandrodamascena had a quick research on similar systems (e.g., Prometheus), and "resolution" parameter name was the closest "neutral" to this.

When we get to scope providers, we will have more thorough discussions and Tenets, as to not fall into the trap of optimising for the lowest common denominators and accidentally back ourselves into a corner we can't innovate/disagree.

Have an excellent and joyful weekend everyone!

niko-achilles commented 1 year ago

Please allow me to take time to respond this next week, as soon as I have a break.

Hi Heitor, i am commenting because i invested time early in the morning, the main trigger is curiosity . This additional comment may bring things together for further discussion with Andrea and Leandro.

@leandrodamascena had a quick research on similar systems (e.g., Prometheus), and "resolution" parameter name was the closest "neutral" to this.

Is the research based on meaning ? Because the motivation, usage scenario and end cause of resolution in Prometheus is a different one to my understanding in comparison to an act with meaning of storage and resolution . Reference Link: https://prometheus.io/blog/2019/01/28/subquery-support/

Can we look at the current research of Leandro, for learning ? Furthermore i liked this article about Relationship between Flush Intervals in StatsD and the Time Resolution of Metrics in the Database, link: https://hostedmetrics.com/articles/introduction-to-statsd/

and also looked how aws-embedded-metrics-node uses the Storage Resolution concept: https://github.com/awslabs/aws-embedded-metrics-node/blob/master/src/logger/MetricsLogger.ts#L134

When we get to scope providers, we will have more thorough discussions and Tenets, as to not fall into the trap of optimising for the lowest common denominators and accidentally back ourselves into a corner we can't innovate/disagree.

Do you mean that the approach of deriving meaning and usage scenario is the balanced efficient method ?

Additional curiosity of me by looking at the code of Metrics in powertools for typescript: Why is the method named for publishing named publishStoredMetrics ?

In lambda given entry-point, handler and event driven, the usage scenario is publishing metrics at end of handler invocation, automatically or manually. Depends on the mechanics used by a dev. like decorator or middleware or manual .
An interval concept has no meaning. Then what was the rationale behind naming the method publishStoreMetrics ? Why not publishMetrics ?

https://github.com/awslabs/aws-lambda-powertools-typescript/blob/db215dd80c9a18a612c32c9fa5e373af1c110476/packages/metrics/src/Metrics.ts#L308

dreamorosi commented 1 year ago

Hi Niko, thank you for your patience. After discussing this internally first, and then afterwards with you earlier today, we have decided to move forward with naming the new parameter simply resolution.

At the moment we are focusing only on EMF as a metrics target so it makes sense to align with their naming conventions. Once we decide to open up to more providers we'll likely have to make a major release anyway, so we'll be able to have a proper discussion and potentially decide to change the name. However I appreciate you already having spent some time looking into this.

So to recap, we are going to follow the specification in the first message of this thread, with the exception of the new parameter being called resolution.

Below I also reply to your other questions, however I've inserted them in a collapsible block, so that we keep the conversation focused:

> Additional curiosity of me by looking at the code of Metrics in powertools for typescript: > > Why is the method named for publishing named `publishStoredMetrics` ? > > Then what was the rationale behind naming the method publishStoreMetrics ? Why not publishMetrics ? To be 100% honest I don't remember the thought process that made us decide that instead of just `publishMetrics`, but I would say that adding the word `stored` maybe makes it more explicit that there's a buffer of metrics that have been created but haven't been emitted yet. If it was called only `publishMetrics` as a new user I might think that this is the method I need to use to create a metric (which is instead `addMetric`). Basically the idea behind Metrics as an utility is to convey the image of you having to add metrics to a store/buffer throughout your code; and then at some point publish them all. Now, I agree that in most cases the publishing might coincide with the end of an invocation, however ether are cases in which I might want to delay or speed up this publishing. I.e. I want to publish metrics only when certain things happen, and I want to do that immediately without waiting for the function to end. > why use enums (from Discord) I think we decided to go with Enums over classes and objects mostly because of the advantages described [in the docs](https://www.typescriptlang.org/docs/handbook/enums.html#objects-vs-enums). Note that this was before we had the `as const` in TypeScript, which would allow us to get some of the benefits with objects.
niko-achilles commented 1 year ago

Yes as discussed with Andrea and Heitor, the parameter name will be resolution and the function definition will be as follows:

public addMetric(name: string, unit: MetricUnit, value: number, resolution?: MetricResolution): void

My next steps would be to introduce a PR as draft and discuss with you the following questions :

For the PR as draft i will consult the following:

Is that OK for you @dreamorosi , @heitorlessa and @leandrodamascena ?

@dreamorosi an additional question specific to typescript , should i use enum for consistency reasons ? Or am I allowed to to introduce in the PR as draft, readonly key value pairs as object , the same as const does for the types ?

dreamorosi commented 1 year ago
public addMetric(name: string, unit: MetricUnit, value: number, resolution?: MetricResolution): void

This looks good.

  • Does the type MetricResolution provide assistance to a developer as guide for allowed values ?
  • Is the plural phase MetricResolutions of the enum descriptive to choose the right value or decide to use the default value?

You're right, let's go with singular: MetricResolution.

For the PR as draft i will consult the following:

Is that OK for you @dreamorosi , @ heitorlessa and @ leandrodamascena ?

It's okay to look at the implementations on these two repos. I don't know how close they will be with how Metrics works in Powertools.

One of the goals of the PR should be to implement the feature with the least amount of changes possible. Less changes in other areas of the code are better.

@dreamorosi an additional question specific to typescript , should i use enum for consistency reasons ? Or am I allowed to to introduce in the PR as draft, readonly key value pairs as object , the same as const does for the types ?

Let's try your proposal and see how it looks. Once you have a first draft I'll checkout the branch and see how it would be from a user perspective, based on that I'll decide if we keep the proposed object or revert to enum like the others.

github-actions[bot] commented 1 year ago

⚠️ COMMENT VISIBILITY WARNING ⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.