dora-team / fourkeys

Platform for monitoring the four key software delivery metrics of software delivery
Apache License 2.0
2.17k stars, 595 forks

Daily Failure Change rate is not reflecting in Grafana Dashboard [CI/CD is Gitlab] #285

Open kevin-vakayil opened 2 years ago

kevin-vakayil commented 2 years ago

Our CI/CD is GitLab. We are currently facing an issue with the Daily Change Failure Rate graph in Grafana. The value appears to be present in BigQuery. Screenshot (730)

The following screenshot shows the incidents table data from our GitLab. For comparison, we also added the mock data. The only difference between the mock data and our sample data is that our changes table has 2 commits. The commits are the same, but they are displayed as 2 rows.

Screenshot (731)

Screenshot (730)

The data seems to show up properly in BigQuery, but it is not displayed in Grafana. Screenshot (733)

We are following the steps from the documentation, but the data (daily change failure rate) is not displayed. Thanks for the support on the previous issue we raised.

kevin-vakayil commented 2 years ago

Our CI/CD is GitLab. After looking into the query, we found that the deployments table does not contain the commit SHA of the failed deployment, so the value in d.changes does not match the changes recorded for the incident.

SUM(IF(i.incident_id is NULL, 0, 1)) / COUNT(DISTINCT change_id) as change_fail_rate
FROM four_keys.deployments d, d.changes

The fragment above, from the DAILY CHANGE FAILURE RATE query, uses d.changes from the deployments table.

Below are the steps we followed, with an explanation of where the issue occurs.
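To make the matching behaviour concrete, here is a minimal Python sketch of the join that the fragment above appears to imply. It assumes that incidents are left-joined to deployment changes on the commit SHA, which is our reading of the query and not a confirmed description of it. The record layout, the function name `change_fail_rate`, and the sample SHAs are hypothetical.

```python
def change_fail_rate(deployments, incidents):
    """Model SUM(IF(i.incident_id is NULL, 0, 1)) / COUNT(DISTINCT change_id).

    Assumption: each deployment change is left-joined to incidents on commit SHA.
    """
    # Flatten incidents into (incident_id, sha) pairs, like unnesting i.changes.
    incident_pairs = [
        (inc["incident_id"], sha) for inc in incidents for sha in inc["changes"]
    ]

    matched_rows = 0
    distinct_changes = set()
    for dep in deployments:
        for sha in dep["changes"]:  # like "FROM four_keys.deployments d, d.changes"
            distinct_changes.add(sha)
            # Each matching incident adds 1; an unmatched change (NULL incident_id) adds 0.
            matched_rows += sum(1 for _, inc_sha in incident_pairs if inc_sha == sha)

    if not distinct_changes:
        return 0.0
    return matched_rows / len(distinct_changes)


# Incident SHA equals a deployed SHA (the shape of the mock data).
matching_rate = change_fail_rate(
    [{"deploy_id": "d1", "changes": ["abc123"]}],
    [{"incident_id": "i1", "changes": ["abc123"]}],
)

# Incident SHA never appears in any deployment's changes.
mismatched_rate = change_fail_rate(
    [{"deploy_id": "d2", "changes": ["def456"]}],
    [{"incident_id": "i2", "changes": ["abc123"]}],
)
print(matching_rate, mismatched_rate)
```

Under these assumptions, an incident only contributes to the rate when its SHA is also listed in some deployment's changes.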

  1. The pipeline fails.
  2. We raise an issue with the root cause, along with the commit SHA.
  3. We change the code and commit it.
  4. We run a new pipeline with the new commit SHA. [The old pipeline can't be run again after a new commit.] [So the new SHA appears in the deployments table as the main commit field, and it differs from the changes field in the incidents table.] [In the mock data, the main commit field in the deployments table and the changes field in the incidents table are the same, and that match is what is used to get the data for DAILY CHANGE FAILURE RATE.] [The problem seems to be that the main commit in the deployment after fixing the issue is not the same as the root-cause commit recorded when the issue was raised.]
  5. The pipeline runs successfully.
  6. The issue is closed.

To compute the DAILY CHANGE FAILURE RATE, the incident's changes need to match a change in the deployments table. Because we run a new pipeline, the commit SHA is different, so no match is found and the value of DAILY CHANGE FAILURE RATE is 0.0.
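One way to confirm this diagnosis is to list the incident commit SHAs that never appear in any deployment's changes. The following is a minimal sketch, assuming both tables have been exported to Python lists of dicts. The field names mirror the ones discussed above, but the helper `unmatched_incident_shas` and the sample values are hypothetical.

```python
def unmatched_incident_shas(deployments, incidents):
    """Return {incident_id: [shas]} for incident SHAs missing from every deployment."""
    # Every commit SHA that appears in any deployment's changes.
    deployed = {sha for dep in deployments for sha in dep["changes"]}

    missing = {}
    for inc in incidents:
        orphaned = [sha for sha in inc["changes"] if sha not in deployed]
        if orphaned:
            missing[inc["incident_id"]] = orphaned
    return missing


# Only the fix commit reached the deployments table.
deployments = [{"deploy_id": "d2", "changes": ["def456"]}]
# The incident records the root-cause commit from the failed pipeline.
incidents = [{"incident_id": "i2", "changes": ["abc123"]}]

result = unmatched_incident_shas(deployments, incidents)
print(result)
```

Any incident listed in the result cannot be joined to a deployment on the commit SHA, which would be consistent with a rate of 0.0.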