keptn / lifecycle-toolkit

Toolkit for cloud-native application lifecycle management
https://keptn.sh
Apache License 2.0

Document how to write / create a new KeptnMetricsProvider #1248

Closed agardnerIT closed 8 months ago

agardnerIT commented 1 year ago

Currently, Prometheus, Datadog, Dynatrace and Dynatrace DQL are the supported KeptnMetricsProviders, but others will likely want to write new providers for other metric backends.

The documentation should contain an explanation about how to achieve this goal.

StackScribe commented 1 year ago

@prateek041 is working on this. GitHub won't let me officially assign it to him yet.

mowies commented 1 year ago

He needs to comment on this issue, then we can assign him.

prateek041 commented 1 year ago

Please assign this issue to me @mowies @StackScribe

agardnerIT commented 1 year ago

Some notes from the hands-on session at Kubecon to assist with this (@thisthat please validate my notes here as it was the end of a long week!)

Imagine you want to add a new metrics provider for a third party service called foobar:

  1. Fork this repo
  2. Modify common.go and add a new entry for foobar like this:
     const FoobarProviderType = "foobar"
  3. Create your own new folder inside this folder matching the new service name: foobar
  4. Implement your metric retrieval logic inside your new folder. Follow prometheus.go as an example.

     You must implement the EvaluateQuery function. Note that the metric value is a string for flexibility: some backends return values like 3m, which means 0.003.

     Don't forget to add test cases.

  5. Add a new case statement to this code section:
    case FoobarProviderType:
        return &foobar.KeptnFoobarProvider{
            Log:        log,
            HttpClient: http.Client{},
            K8sClient:  k8sClient,
        }, nil
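
The steps above can be sketched as a small, self-contained Go program. This is only an illustration under stated assumptions: the Provider interface, the KeptnFoobarProvider type, the NewProvider factory, the demo helper, and the JSON response shape are all hypothetical stand-ins. The toolkit's real EvaluateQuery works with Keptn API objects, and its factory also passes a logger and a Kubernetes client, as in the case statement above.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// FoobarProviderType mirrors the constant added in step 2.
const FoobarProviderType = "foobar"

// Provider is a simplified, hypothetical stand-in for the toolkit's provider
// interface; the real one takes Keptn API objects rather than plain strings.
type Provider interface {
	EvaluateQuery(ctx context.Context, query, targetURL string) (string, error)
}

// KeptnFoobarProvider is a hypothetical provider for the imaginary foobar backend.
type KeptnFoobarProvider struct {
	HTTPClient *http.Client
}

// EvaluateQuery sends the query to the backend and returns the value as a string.
// The {"value": "..."} response shape is an assumption made for this sketch.
func (p *KeptnFoobarProvider) EvaluateQuery(ctx context.Context, query, targetURL string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, targetURL, nil)
	if err != nil {
		return "", err
	}
	q := req.URL.Query()
	q.Set("query", query)
	req.URL.RawQuery = q.Encode()

	resp, err := p.HTTPClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("foobar returned status %d", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	var payload struct {
		Value string `json:"value"`
	}
	if err := json.Unmarshal(body, &payload); err != nil {
		return "", fmt.Errorf("cannot decode foobar response: %w", err)
	}
	return payload.Value, nil
}

// NewProvider mirrors step 5: one case per supported provider type.
func NewProvider(providerType string) (Provider, error) {
	switch providerType {
	case FoobarProviderType:
		return &KeptnFoobarProvider{HTTPClient: &http.Client{Timeout: 10 * time.Second}}, nil
	default:
		return nil, fmt.Errorf("provider %q is not supported", providerType)
	}
}

// demo queries a local fake backend so the sketch runs without a real service.
func demo() (string, error) {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, `{"value":"3m"}`)
	}))
	defer backend.Close()

	p, err := NewProvider(FoobarProviderType)
	if err != nil {
		return "", err
	}
	return p.EvaluateQuery(context.Background(), "latency", backend.URL)
}

func main() {
	value, err := demo()
	if err != nil {
		fmt.Println("query failed:", err)
		return
	}
	fmt.Println("metric value:", value)
}
```

Keeping the provider behind a small interface means the factory is the only place that needs to know about each backend.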

@thisthat I believe you mentioned that it's possible to code metrics providers in other languages (any other languages or only specific ones?). I think you mentioned TypeScript. Is there demo code / reference impl for this?
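
On the note above that metric values are strings such as 3m: here is a minimal Go sketch of how a consumer might turn such a string into a number. parseMetricValue is a hypothetical helper, not part of the toolkit, and it assumes that only plain numbers and the milli suffix "m" need handling.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMetricValue converts a metric string to a float64.
// It understands plain numbers ("0.5") and the milli suffix ("3m" means 0.003).
// Hypothetical helper for illustration; other suffixes are not handled.
func parseMetricValue(raw string) (float64, error) {
	s := strings.TrimSpace(raw)
	if strings.HasSuffix(s, "m") {
		n, err := strconv.ParseFloat(strings.TrimSuffix(s, "m"), 64)
		if err != nil {
			return 0, fmt.Errorf("invalid milli value %q: %w", raw, err)
		}
		return n / 1000, nil
	}
	n, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid metric value %q: %w", raw, err)
	}
	return n, nil
}

func main() {
	for _, raw := range []string{"3m", "0.5", "abc"} {
		v, err := parseMetricValue(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s parses to %g\n", raw, v)
	}
}
```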

thisthat commented 1 year ago

Hey @agardnerIT , your list of steps is perfect 💯

Yes, technically you could implement a task that does the metric retrieval and checks for some thresholds. We have an example for Prometheus here: https://github.com/keptn/lifecycle-toolkit/blob/main/functions-runtime/samples/ts/prometheus.ts. However, I would not recommend this approach because you would bypass the metric-server, and hence the results would not be available to other tools via the K8s metric API.

prateek041 commented 1 year ago

Hey @thisthat @agardnerIT

Since I am documenting the process, I would like to add a provider on my own so that I can go through the process/steps and write better documentation.

Which other metrics providers is the Keptn community planning to integrate? Maybe I can work on one of them.

thisthat commented 1 year ago

Hey @prateek041, that's a great idea! At Kubecon, some folks talked about the need for metrics coming from cloud providers. Maybe AWS CloudWatch or GCP Metric.

rakshitgondwal commented 1 year ago

Hey, @prateek041! Actually, this idea of adding additional metric providers is also a GSoC project, and @sudiptob2 may be applying for that project. So I think it would be a good idea to ask him beforehand which metric providers he is planning to integrate if he gets selected.

prateek041 commented 1 year ago

Thanks for the input @rakshitgondwal. I totally forgot that this was also a GSoC project, but since the organisation decides which providers to implement, there won't be any conflict (we can implement separate operators). I will still talk to him about it on Slack. Also, I applied to GSoC for the Backstage plugin.

In the meantime, what do you suggest, @thisthat? Should I go ahead with either GCP Metric or AWS CloudWatch?

sudiptob2 commented 1 year ago

Hi, thanks @rakshitgondwal for mentioning me. In my proposal, I have proposed implementing around 4 providers, but the exact providers will be decided after discussing with the mentors (if my proposal gets selected). I think @prateek041 can go ahead with his provider of choice at this point.

rakshitgondwal commented 1 year ago

Amazing! All yours then @prateek041

thisthat commented 1 year ago

Thanks to everyone :) @prateek041, you can choose the one that you're more familiar with - or both 😝

agardnerIT commented 1 year ago

Perhaps a "dummy" provider (I know @thisthat mentioned this at kubecon but for the benefit of others on this thread).

  1. Create a simple "endpoint" that returns a value (perhaps a random value or one that cycles up and down from 1 to 10?)
  2. Document how to create a provider for this "dummy" endpoint

Doing this, and documenting how, would (I believe) help explain the concept without scaring off readers who think "oh, I don't know how (vendor X) works so I can't read this documentation". For example, "this page is based on AWS Cloudwatch and I don't know / use AWS so I can't follow it, so I won't bother".

My 2c.

Edit: This endpoint actually already exists: GET http://www.randomnumberapi.com/api/v1.0/random so it's even easier now :)
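
For anyone who prefers a local endpoint over a public API, the "cycles up and down from 1 to 10" idea can be sketched in Go as below. All names here (cycler, dummyHandler) are made up for illustration, and the plain-text response format is an assumption. main uses a throwaway httptest server so the sketch exits on its own; a long-running endpoint would use http.ListenAndServe instead.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// cycler yields 1, 2, ..., 10, 9, ..., 2 and then starts over at 1.
type cycler struct {
	mu    sync.Mutex
	calls int
}

// next returns the next value of the up-and-down cycle (period of 18 calls).
func (c *cycler) next() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	pos := c.calls % 18
	c.calls++
	if pos < 10 {
		return pos + 1 // rising part: 1..10
	}
	return 19 - pos // falling part: 9..2
}

// dummyHandler serves the next cycle value as plain text on every request.
func dummyHandler(c *cycler) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintf(w, "%d", c.next())
	}
}

func main() {
	srv := httptest.NewServer(dummyHandler(&cycler{}))
	defer srv.Close()

	// Query the endpoint a few times to see the value change.
	for i := 0; i < 3; i++ {
		resp, err := http.Get(srv.URL)
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			fmt.Println("read failed:", err)
			return
		}
		fmt.Println("value:", string(body))
	}
}
```

Unlike a random endpoint, this one is predictable, which also makes it easier to write assertions against.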

PriyanshuAhlawat commented 1 year ago

@prateek041 If you want to go with one, then you should go for AWS, as it is one of the most preferred choices for many developers. Since I have also applied for the GSoC project for metric providers, I think AWS would be a good choice.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

agardnerIT commented 1 year ago

This should be re-opened as it is absolutely necessary.

mowies commented 1 year ago

@prateek041 if you still wanna work on this, i'll gladly re-assign you :)

github-actions[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

sudiptob2 commented 10 months ago

Hi, is this issue still open? I think I can contribute to it, but I would need some help from experienced maintainers. Last year I implemented the Datadog provider, so I have some experience with this. But to be honest, after a long gap, I have forgotten a lot of things. I can start over and write the guide if no one is currently working on it.

github-actions[bot] commented 9 months ago

This issue will be unassigned in 1 week if no further activity is seen. If you are active please provide an update on the status of the issue and if you would like to continue working on it.

agardnerIT commented 9 months ago

Re-open as this is necessary. @sudiptob2 are you still able to work on this?

sudiptob2 commented 9 months ago

@agardnerIT, yes still interested in this one. However, occupied with another issue at this moment. If anyone else is available, feel free to work on it.

geoffrey1330 commented 8 months ago

@mowies could you please assign this issue to me?

mowies commented 8 months ago

Done! Note that somebody already started working on this but never finished, so there's an old, closed PR for this issue that you could check out (#1382).

geoffrey1330 commented 8 months ago

Alright thanks I will check it out.

geoffrey1330 commented 8 months ago

Hi @mowies, I'm facing some difficulties testing the dummy provider because the API I am using returns random numbers, so its output is inconsistent. I would therefore like to switch to something else, for example the https://restcountries.com API, which returns countries and information about them, so that the API output is consistent. cc @StackScribe
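
One way to get consistent output when testing, without depending on any public API, is a local stub server that always returns a fixed response. The Go sketch below shows the idea using net/http/httptest from the standard library. fetchValue and fetchFromStub are hypothetical names, and fetchValue is only a stand-in for a provider's retrieval logic. In a real test this would live in a _test.go file and use the testing package.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// fetchValue is a hypothetical stand-in for a provider's retrieval logic:
// it GETs the URL and returns the trimmed response body as the metric value.
func fetchValue(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(body)), nil
}

// fetchFromStub starts a local stub that always answers with the given status
// and body, then runs fetchValue against it. The output is fully deterministic.
func fetchFromStub(status int, body string) (string, error) {
	stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
		fmt.Fprint(w, body)
	}))
	defer stub.Close()
	return fetchValue(stub.Client(), stub.URL)
}

func main() {
	// Happy path: the stub returns a fixed value.
	value, err := fetchFromStub(http.StatusOK, "42\n")
	fmt.Println("value:", value, "error:", err)

	// Failure path: the stub returns a server error.
	_, err = fetchFromStub(http.StatusInternalServerError, "boom")
	fmt.Println("error:", err)
}
```

Because the stub controls both the status code and the body, the same approach covers error cases that are hard to trigger against a live service.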

mowies commented 8 months ago

A different API is definitely fine, sure! But keep in mind that the biggest part of this ticket should be documentation on how to implement a new provider, not necessarily a new provider in itself. EDIT: I see you already added some docs, nice!