red-hat-storage / ocs-ci

https://ocs-ci.readthedocs.io/en/latest/
MIT License

test_fio_benchmark.py - the test fails in 4.14 - a fix is needed #8644

Closed ypersky1980 closed 10 months ago

ypersky1980 commented 11 months ago

test_fio_benchmark.py - all test cases failed on both AWS and VMware LSO platforms while running on 4.14. A fix is needed!

The test passed in 4.13 on all the platforms.

VMware LSO: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/28837/testReport/
AWS: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/view/Performance/job/qe-trigger-aws-ipi-3az-rhcos-3m-3w-performance/101/testReport/

pintojoy commented 11 months ago

The issue is that the "read_op_per_sec" key is missing from the ceph status JSON output:

2023-09-07 16:59:18  11:29:18 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage rsh rook-ceph-tools-7b4d6fc4f4-xxss8 ceph status --format json-pretty
2023-09-07 16:59:19  11:29:18 - MainThread - ocs_ci.framework.pytest_customization.ocscilib - ERROR - Failed to collect performance stats
2023-09-07 16:59:19  Traceback (most recent call last):
2023-09-07 16:59:19    File "/home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/framework/pytest_customization/ocscilib.py", line 731, in pytest_runtest_makereport
2023-09-07 16:59:19      collect_performance_stats(test_case_name)
2023-09-07 16:59:19    File "/home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/helpers/helpers.py", line 2852, in collect_performance_stats
2023-09-07 16:59:19      iops_percentage = ceph_obj.get_iops_percentage()
2023-09-07 16:59:19    File "/home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/cluster.py", line 695, in get_iops_percentage
2023-09-07 16:59:19      iops_in_cluster = self.get_ceph_cluster_iops()
2023-09-07 16:59:19    File "/home/jenkins/workspace/qe-deploy-ocs-cluster-prod/ocs-ci/ocs_ci/ocs/cluster.py", line 675, in get_ceph_cluster_iops
2023-09-07 16:59:19      read_ops = ceph_status["pgmap"]["read_op_per_sec"]
2023-09-07 16:59:19  KeyError: 'read_op_per_sec'
2023-09-07 16:59:19  ERROR
2023-09-07 16:59:19  ----------------------
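A minimal sketch of a defensive fix, under the assumption that the intent of `get_ceph_cluster_iops` in ocs_ci/ocs/cluster.py stays the same: Ceph omits the per-second counters from `pgmap` when there is no recent client I/O, so they should default to 0 instead of raising KeyError. The function name and the way the status JSON is obtained here are illustrative, not the merged fix:

```python
import json


def get_cluster_iops(ceph_status_json):
    """Sum read and write IOPS from `ceph status --format json` output.

    Ceph drops read_op_per_sec/write_op_per_sec from pgmap when there is no
    recent client I/O, so default both to 0 instead of raising KeyError.
    """
    ceph_status = json.loads(ceph_status_json)
    pgmap = ceph_status.get("pgmap", {})
    read_ops = pgmap.get("read_op_per_sec", 0)
    write_ops = pgmap.get("write_op_per_sec", 0)
    return read_ops + write_ops
```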

This was observed in 4.14.0-117. Ceph must-gather:

"pgmap": {
    "pgs_by_state": [
        {
            "state_name": "active+clean",
            "count": 257
        }
    ],
    "num_pgs": 257,
    "num_pools": 6,
    "num_objects": 7762,
    "data_bytes": 31516810940,
    "bytes_used": 99156013056,
    "bytes_avail": 6497913753600,
    "bytes_total": 6597069766656
},

Whereas in a more recent build (4.14.0-141) I am seeing this key:

"pgmap": {
    "pgs_by_state": [
        {
            "state_name": "active+clean",
            "count": 145
        }
    ],
    "num_pgs": 145,
    "num_pools": 5,
    "num_objects": 67923,
    "data_bytes": 170620988180,
    "bytes_used": 298004226048,
    "bytes_avail": 3000429993984,
    "bytes_total": 3298434220032,
    "read_bytes_sec": 42102652,
    "write_bytes_sec": 126409307,
    "read_op_per_sec": 457,
    "write_op_per_sec": 1032
},
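For quickly checking which of these keys a given cluster currently reports, a small sketch that mirrors the `ceph status` command from the log above; the toolbox pod name is a placeholder and must be replaced with the actual rook-ceph-tools pod in the cluster:

```python
import json
import subprocess

# Placeholder pod name: substitute the real rook-ceph-tools pod, e.g. from
# `oc -n openshift-storage get pods -l app=rook-ceph-tools`.
CMD = (
    "oc -n openshift-storage rsh rook-ceph-tools-<pod-suffix> "
    "ceph status --format json"
)

output = subprocess.check_output(CMD, shell=True, text=True)
pgmap = json.loads(output).get("pgmap", {})
for key in ("read_bytes_sec", "write_bytes_sec",
            "read_op_per_sec", "write_op_per_sec"):
    # The key is only present while there is recent client I/O.
    print(f"{key}: {pgmap.get(key, 'absent')}")
```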