Closed slashpai closed 3 months ago
@slashpai: This pull request references Jira Issue OCPBUGS-35480, which is valid. The bug has been moved to the POST state.
The bug has been updated to refer to the pull request using the external bug tracker.
/retest-required
/retest-required
/test e2e-aws-ovn-techpreview
@simonpasquier can you review again?
/retest
The failure in the tech preview job looks unrelated to this change.
Tested with the PR. UpdatingPrometheusAdapter still appears in the CMO logs (the fix for this is not included in this PR). There is also no warning message for the deprecated PrometheusAdapter setting in the CMO logs (do we need to log a warning message?).
$ oc -n openshift-monitoring logs deploy/cluster-monitoring-operator | grep "UpdatingPrometheusAdapter" | head -n6
I0619 04:05:53.203865 1 tasks.go:70] running task 7 of 16: UpdatingPrometheusAdapter
I0619 04:05:53.203870 1 tasks.go:76] ran task 7 of 16: UpdatingPrometheusAdapter
I0619 04:10:59.294312 1 tasks.go:70] running task 7 of 16: UpdatingPrometheusAdapter
I0619 04:10:59.294366 1 tasks.go:76] ran task 7 of 16: UpdatingPrometheusAdapter
I0619 04:12:28.365875 1 tasks.go:70] running task 7 of 16: UpdatingPrometheusAdapter
I0619 04:12:28.365926 1 tasks.go:76] ran task 7 of 16: UpdatingPrometheusAdapter
The alert ClusterMonitoringOperatorDeprecatedConfig can be fired:
$ token=`oc create token prometheus-k8s -n openshift-monitoring`
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/query?' --data-urlencode 'query=cluster_monitoring_operator_deprecated_config_in_use' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "cluster_monitoring_operator_deprecated_config_in_use",
          "configmap": "openshift-monitoring/cluster-monitoring-config",
          "container": "cluster-monitoring-operator",
          "deprecation_version": "4.16",
          "endpoint": "https",
          "field": "k8sPrometheusAdapter",
          "instance": "10.130.0.15:8443",
          "job": "cluster-monitoring-operator",
          "namespace": "openshift-monitoring",
          "pod": "cluster-monitoring-operator-59c7b4845d-q2mrz",
          "prometheus": "openshift-monitoring/k8s",
          "service": "cluster-monitoring-operator"
        },
        "value": [
          1718774171.547,
          "1"
        ]
      }
    ],
    "analysis": {}
  }
}
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/query?' --data-urlencode 'query=ALERTS{alertname="ClusterMonitoringOperatorDeprecatedConfig"}' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "ALERTS",
          "alertname": "ClusterMonitoringOperatorDeprecatedConfig",
          "alertstate": "pending",
          "configmap": "openshift-monitoring/cluster-monitoring-config",
          "deprecation_version": "4.16",
          "field": "k8sPrometheusAdapter",
          "prometheus": "openshift-monitoring/k8s",
          "severity": "info"
        },
        "value": [
          1718774477.234,
          "1"
        ]
      }
    ],
    "analysis": {}
  }
}
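For anyone repeating this check, a minimal sketch of how the `alertstate` could be pulled out of the query response programmatically instead of read by eye. The helper name `alert_states` is hypothetical and the sample payload is trimmed from the output above; it only assumes the standard Prometheus instant-query response shape.

```python
import json

# Sample payload trimmed from the ALERTS query output above.
SAMPLE = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "ALERTS",
          "alertname": "ClusterMonitoringOperatorDeprecatedConfig",
          "alertstate": "pending",
          "field": "k8sPrometheusAdapter",
          "severity": "info"
        },
        "value": [1718774477.234, "1"]
      }
    ],
    "analysis": {}
  }
}
"""


def alert_states(response, alertname):
    """Return the alertstate label of every ALERTS series for alertname.

    Hypothetical helper: `response` is the parsed JSON body of an
    instant query against /api/v1/query.
    """
    if response.get("status") != "success":
        raise ValueError("query did not succeed")
    return [
        series["metric"].get("alertstate", "")
        for series in response["data"]["result"]
        if series["metric"].get("alertname") == alertname
    ]


states = alert_states(json.loads(SAMPLE), "ClusterMonitoringOperatorDeprecatedConfig")
print(states)  # ['pending']
```

An empty list would mean the alert is neither pending nor firing at query time.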
We haven't removed the PrometheusAdapter task in this PR. We have only added the deprecation metric, which is set to 1 if any k8sPrometheusAdapter fields are defined in the cluster-monitoring-config config map.
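To illustrate the rule, here is a rough sketch of the check. This is not the CMO implementation (CMO is written in Go); the function name and the dict-based config are hypothetical, and treating an empty `k8sPrometheusAdapter` section as "not defined" is an assumption.

```python
def deprecated_config_in_use(config, field="k8sPrometheusAdapter"):
    """Return 1 if the deprecated section has any fields set, else 0.

    Hypothetical helper: `config` stands for the parsed config.yaml
    from the cluster-monitoring-config config map.
    Assumption: an absent or empty section counts as not in use.
    """
    return 1 if config.get(field) else 0


# Example configs (illustrative values only).
with_adapter = {"k8sPrometheusAdapter": {"nodeSelector": {"kubernetes.io/os": "linux"}}}
without_adapter = {"prometheusK8s": {"retention": "24h"}}

print(deprecated_config_in_use(with_adapter))     # 1
print(deprecated_config_in_use(without_adapter))  # 0
```

The returned value corresponds to the sample value of `cluster_monitoring_operator_deprecated_config_in_use` seen in the query output earlier in this thread.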
/label qe-approved
@slashpai: This pull request references Jira Issue OCPBUGS-35480, which is valid.
@juzhao We can also log a warning in the CMO logs if any k8sPrometheusAdapter fields are defined. I will update the PR.
/hold
maybe combine with #2386?
@slashpai: This pull request references Jira Issue OCPBUGS-35480, which is valid.
The bug has been updated to refer to the pull request using the external bug tracker.
/hold need to test the configs on a cluster where metrics-server is already running
@simonpasquier I think we need to test applying the prometheus-adapter config settings to metrics-server (when such config exists) a bit more, as it can be error prone (based on a quick test in cluster bot). Shall we do that in a separate PR and keep only the deprecation part in this PR?
Edit: Discussed offline and decided to keep only deprecation code in this PR
/cherry-pick release-4.16
@slashpai: once the present PR merges, I will cherry-pick it on top of release-4.16 in a new PR and assign it to you.
@simonpasquier can you review again?
/retest
@simonpasquier PTAL when you get a chance :)
/lgtm Thanks!
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: machine424, simonpasquier, slashpai
The full list of commands accepted by this bot can be found here.
The pull request process is described here
/retest-required
Remaining retests: 0 against base HEAD 7f52b1a0f0de84925ab93412a4e5127c000f774c and 2 for PR HEAD 51b540b4c56a037037604ddc7a0675dd741bb02e in total
/retest-required
Remaining retests: 0 against base HEAD 5d1fd1bb52eeb9b2f877c45de0cf93e2f9fffb95 and 1 for PR HEAD 51b540b4c56a037037604ddc7a0675dd741bb02e in total
/retest-required
/retest-required
@slashpai: all tests passed!
Full PR test history. Your PR dashboard.
@slashpai: Jira Issue OCPBUGS-35480: All pull requests linked via external trackers have merged:
Jira Issue OCPBUGS-35480 has been moved to the MODIFIED state.
@slashpai: new pull request created: #2396