openshift-qe / ocp-qe-perfscale-ci

OpenShift QE PerfScale CI
Apache License 2.0

Cerberus check fails with SSLEOFError #304

Open svetsa-rh opened 1 year ago

svetsa-rh commented 1 year ago

This looks like another failure caused either by an http_proxy/https_proxy problem or by missing Python packages.

Error reported:

10-20 15:44:58.629 2022-10-20 15:44:58,403 [ERROR] Failed to get the metrics: HTTPSConnectionPool(host='prometheus-k8s-openshift-monitoring.apps.scaleci12-20928.qe.devcluster.openshift.com', port=443): Max retries exceeded with url: /api/v1/query?query=ALERTS%7Balertname%3D%22etcdHighNumberOfLeaderChanges%22%2C+severity%3D%22warning%22%7D (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))
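To make the failing request easier to read, here is a minimal sketch (standard library only) that decodes the PromQL expression from the URL in the error and lists which proxy variables are set in the environment. The helper names `decode_promql` and `proxy_settings` are hypothetical and are not part of cerberus. It is only a triage aid under those assumptions.

```python
import os
from urllib.parse import parse_qs, urlsplit

# Path and query string copied from the error message above.
FAILED_REQUEST = (
    "/api/v1/query?query=ALERTS%7Balertname%3D%22etcdHighNumberOfLeaderChanges"
    "%22%2C+severity%3D%22warning%22%7D"
)


def decode_promql(request_path: str) -> str:
    """Return the decoded PromQL expression from a Prometheus query URL."""
    query_string = urlsplit(request_path).query
    return parse_qs(query_string)["query"][0]


def proxy_settings(environ=os.environ) -> dict:
    """Collect the proxy-related variables (lower or upper case) that are set."""
    found = {}
    for name in ("http_proxy", "https_proxy", "no_proxy"):
        for key in (name, name.upper()):
            if environ.get(key):
                found[key] = environ[key]
    return found


if __name__ == "__main__":
    print(decode_promql(FAILED_REQUEST))
    print(proxy_settings())
```

Running this on the Jenkins agent shows the alert query cerberus was issuing and whether a proxy is configured for the job at all.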

Failures on a private cluster: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/scale-nightly-regression/486/

https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/cerberus/531/console

It also failed on a non-private cluster: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/scale-nightly-regression/491/

https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/cerberus/549/console

qiliRedHat commented 1 year ago

This could be related to the bug https://issues.redhat.com/browse/OCPBUGS-2557. The router-perf test that ran before the cerberus job broke all routes on the cluster. I didn't see this issue with the router-perf test on Azure: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/scale-nightly-regression/490/
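One way to check whether the routes themselves are broken, independent of cerberus, is to attempt a plain TLS handshake against a route host and see how it fails. The sketch below uses only the standard library. The names `classify` and `probe_route` are hypothetical, certificate verification is disabled on the assumption that the cluster uses a self-signed ingress certificate, and the connection is made directly, so it does not go through http_proxy/https_proxy and will report a private cluster as unreachable from outside.

```python
import socket
import ssl


def classify(exc: BaseException) -> str:
    """Map a connection failure to a coarse category."""
    if isinstance(exc, ssl.SSLEOFError):
        return "tls-eof"        # peer closed the connection during the handshake
    if isinstance(exc, ssl.SSLError):
        return "tls-error"      # some other TLS failure
    if isinstance(exc, OSError):
        return "unreachable"    # DNS failure, refused connection, timeout, ...
    return "unknown"


def probe_route(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Try a TLS handshake with the route host and report the outcome."""
    ctx = ssl.create_default_context()
    # Assumption: self-signed ingress cert, so skip verification for this probe.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "ok"
    except Exception as exc:
        return classify(exc)


if __name__ == "__main__":
    host = "prometheus-k8s-openshift-monitoring.apps.scaleci12-20928.qe.devcluster.openshift.com"
    print(probe_route(host))
```

A result of `tls-eof` for several different routes would be consistent with the router being broken rather than with a proxy or Python package problem, though it does not prove it.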