operator-framework / operator-sdk

SDK for building Kubernetes applications. Provides high level APIs, useful abstractions, and project scaffolding.
https://sdk.operatorframework.io
Apache License 2.0

Scorecard shows warnings from the k8s API response when the Operator makes requests to deprecated REST APIs #5133

Open camilamacedo86 opened 3 years ago

camilamacedo86 commented 3 years ago

Feature Request

Describe the problem you need a feature to resolve.

I'd like the default Scorecard test results to report on APIs that are deprecated or will be removed in upcoming versions.

Goal

The goal is to check whether Scorecard could gather the information that the Kubernetes API already provides when an operator running on the cluster makes requests using deprecated APIs.

No Goal
Motivation

Scorecard is the SDK feature that is capable of running tests on the cluster. Today it is recommended to run regression tests on a project and then check the cluster's events/metrics to see whether any warnings about deprecated APIs were raised. In the same way, the default Scorecard tests could gather those warnings and append them to their results.

Use Case

As an operator author, I would like to gather the warnings raised by the k8s API while my operator is running and being tested by Scorecard on the cluster, so that I am informed ahead of time that my operator is making requests to deprecated APIs.

Describe the solution you'd like.

Implement a Scorecard check that looks for the events/metrics raised by the k8s API, gathers their WARNINGS, and returns them as part of the test results. More info: https://kubernetes.io/blog/2020/09/03/warnings/#deprecation-warnings. E.g.:

1) Run the Scorecard tests to trigger the reconcile, for example by applying the operator's CRs on the cluster.

2) Gather the warnings raised by the k8s API. For example, we can check the warnings by looking at the events:

kubectl get events --field-selector="reason=AppliedWithWarnings" --all-namespaces

OR by querying the metrics:

kubectl get --raw /metrics | prom2json | jq ' .[] | select(.name=="apiserver_requested_deprecated_apis").metrics[].labels '

3) Append the WARNINGs to the Scorecard results.
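The metrics-based check in step 2 could be sketched as follows, without prom2json or jq. This is only an illustration, assuming the raw text returned by `kubectl get --raw /metrics` has already been captured into a string. The function names `deprecated_apis` and `as_warnings` are hypothetical, and the label names (`group`, `version`, `resource`, `removed_release`) are the ones the `apiserver_requested_deprecated_apis` metric is described with in the Kubernetes blog post linked above.

```python
import re

METRIC = "apiserver_requested_deprecated_apis"

# One label pair such as group="batch"; values may contain escaped quotes.
_LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def deprecated_apis(metrics_text):
    """Return the label dict of every deprecated-API sample whose value is 1."""
    found = []
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip "# HELP" / "# TYPE" comments and every other metric family.
        if not line.startswith(METRIC + "{"):
            continue
        labels_part, _, rest = line.rpartition("}")
        fields = rest.split()
        if not fields or float(fields[0]) != 1:
            continue
        found.append(dict(_LABEL_RE.findall(labels_part)))
    return found


def as_warnings(apis):
    """Format each label dict as a one-line, human-readable warning."""
    warnings = []
    for api in apis:
        group, version = api.get("group", ""), api.get("version", "")
        group_version = f"{group}/{version}" if group else version
        removed = api.get("removed_release") or "a future release"
        warnings.append(
            f"{group_version} {api.get('resource', '')} is deprecated "
            f"(removed in {removed})"
        )
    return warnings
```

Note that this reports every deprecated API requested by any client on the cluster, not only those requested by the operator under test.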

Additional Context:

OCP introduced Prometheus alerts based on the k8s metrics. Two alerts were introduced with OpenShift 4.8.

More information on alerts, how to retrieve them or how to get notified is available in OpenShift documentation.

/language go /language ansible /language helm

tlwu2013 commented 3 years ago

It could be just me, but this “reporting the deprecated/removed APIs” sounds like a really handy scorecard test that is worth becoming a default test and benefits all Operators

openshift-bot commented 2 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

camilamacedo86 commented 2 years ago

/lifecycle frozen

camilamacedo86 commented 2 years ago

The following additional information may help us implement this as a proposed solution.

Proposed solution

Develop a custom Scorecard test that identifies deprecated APIs using the k8s metrics and raises warnings based on how the Operator's services (CRs) use them. The test ought to work dynamically, without needing to be updated every time an API is flagged as deprecated.

What value can it bring?

TL;DR: Technical Details

Is it enough just to install the operator?

It probably requires applying all CRs to ensure that the operator hits the deprecated APIs (in 1.25 and 1.26, for example) and triggers the Kube metrics.

How to use the Kube metrics/alerts?

We can query the metrics directly, but the raw result is not good enough because it also includes requests that were not made by the operator.

We can develop a command such as the following to get the metrics, but we still need to find out how to exclude the requests that were not made by the operator.

kubectl get --raw /metrics | prom2json | jq '
  # set $deprecated to a list of deprecated APIs
  [
    .[] | 
    select(.name=="apiserver_requested_deprecated_apis").metrics[].labels |
    {group,version,resource}
  ] as $deprecated 

  |

  # select apiserver_request_total metrics which are deprecated
  .[] | select(.name=="apiserver_request_total").metrics[] |
  select(.labels | {group,version,resource} as $key | $deprecated | index($key))
'
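For readers who prefer code to a jq filter, the same join can be sketched in Python over the JSON that prom2json emits (a list of metric families, each with a `metrics` list of `labels`/`value` pairs). The second function is only one possible way to narrow the result toward the operator: it compares a snapshot taken before the Scorecard run with one taken after. That baseline idea is an assumption of this sketch, not something proposed in this thread, and it still counts any other client that hit a deprecated API during the test window. Both function names are hypothetical.

```python
DEPRECATED = "apiserver_requested_deprecated_apis"
REQUESTS = "apiserver_request_total"


def _key(metric):
    """Identify an API by (group, version, resource), like the jq filter does."""
    labels = metric.get("labels", {})
    return (labels.get("group"), labels.get("version"), labels.get("resource"))


def deprecated_requests(families):
    """Select apiserver_request_total samples that target a deprecated API."""
    by_name = {f["name"]: f.get("metrics", []) for f in families}
    deprecated = {_key(m) for m in by_name.get(DEPRECATED, [])}
    return [m for m in by_name.get(REQUESTS, []) if _key(m) in deprecated]


def new_deprecated_requests(before, after):
    """Return counters that grew between two snapshots, keyed by their labels.

    Assumption: the cluster is otherwise quiet during the test run, so the
    growth is mostly caused by the operator under test.
    """
    def totals(families):
        return {
            tuple(sorted(m["labels"].items())): float(m["value"])
            for m in deprecated_requests(families)
        }

    old = totals(before)
    return {
        labels: count - old.get(labels, 0.0)
        for labels, count in totals(after).items()
        if count > old.get(labels, 0.0)
    }
```

The two snapshots would be the parsed output of `kubectl get --raw /metrics | prom2json`, captured once before the CRs are applied and once after the tests finish.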

We might be able to:

Do we have an example scenario?

An operator that uses APIs removed in 1.25, and which we can use for testing, is https://github.com/keycloak/keycloak-operator. See: https://github.com/keycloak/keycloak-operator/blob/996c21bca9f1d948e784d4f1ef5caaba5088944e/pkg/model/postgresql_aws_periodic_backup.go#L6. The deprecated/removed API is called when reconciling the KeycloakBackup CR, because the operator creates the batch resource on the cluster using k8s.io/api/batch/v1beta1.

How to create a custom Scorecard test? 

https://sdk.operatorframework.io/docs/testing-operators/scorecard/custom-tests/
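As a rough illustration of what such a test would hand back, the sketch below builds a result document with the deprecation warnings attached as suggestions, so the test still passes while surfacing them. It assumes a custom test image reports by printing a v1alpha3-style `TestStatus` JSON document to stdout, and the field names used here (`results`, `name`, `state`, `log`, `suggestions`) reflect my understanding of that type and should be checked against the docs linked above. The function name and the test name `deprecated-apis` are hypothetical.

```python
import json


def scorecard_status(warnings, test_name="deprecated-apis"):
    """Build a TestStatus-like dict that passes but carries the warnings."""
    result = {
        "name": test_name,
        # Deprecations are reported as warnings, so the test itself passes.
        "state": "pass",
        "log": f"{len(warnings)} deprecated API(s) requested during the test run",
    }
    if warnings:
        result["suggestions"] = list(warnings)
    return {"results": [result]}


if __name__ == "__main__":
    # Example input; a real test would derive this from the cluster metrics.
    example = ["batch/v1beta1 cronjobs is deprecated (removed in 1.25)"]
    print(json.dumps(scorecard_status(example)))
```

Reporting through suggestions rather than a failing state matches the request in this issue to raise warnings, but whether a deprecated API should fail the test is a design decision left open here.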

What would this custom Scorecard test need to be able to do?