cncf / devstats

📈CNCF-created tool for analyzing and graphing developer contributions
https://devstats.cncf.io
Apache License 2.0

Two different results about the same median time from open to merge for PRs #18

Closed Jeanine-tw closed 11 months ago

Jeanine-tw commented 11 months ago

On page 1, the average value of the median time (7 days MA) is 21.09 hours, while page 2 shows the same metric as 4.80 hours.

I'm confused by the two different numbers for the same metric. Is there any difference between them?

[Screenshot 2023-08-04 15 51 31] [Screenshot 2023-08-04 15 50 46]
lukaszgryglicki commented 11 months ago

Hi, will take a look today or on Monday, thanks for reporting.

lukaszgryglicki commented 11 months ago

On it. I am comparing the SQL for those two metrics and trying to figure out what causes the difference. When you look at the last value of the median in both charts, they are the same, and the first WITH clause used in them gives the same data for all-time PRs, so there must be something else. I will update when I know what.

lukaszgryglicki commented 11 months ago

I see what the problem is: we have a table gha_pull_requests storing PR data, and we also have gha_issues_pull_requests. I won't dive into details, but every PR on GitHub is also an issue, which means part of its data is stored on the issue object; in gha_issues_pull_requests we store the connection from a PR to its corresponding issue data. For some reason unknown to me, many PRs do not have a connection to issues, see this:

hwameistor=# select count(distinct pull_request_id) from gha_issues_pull_requests;
 count 
-------
   339
(1 row)

hwameistor=# select count(distinct id) from gha_pull_requests;
 count 
-------
  1146
(1 row)

hwameistor=# select count(distinct id) from gha_pull_requests where id not in (select pull_request_id from gha_issues_pull_requests);
 count 
-------
   807
(1 row)
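To illustrate why the missing links matter for a median, here is a minimal sketch on toy data. It is not the real devstats metric SQL: the table names match the ones above, but the `hours_to_merge` column, the `median_hours` helper, and all values are hypothetical, and SQLite stands in for PostgreSQL. It shows that joining PRs to gha_issues_pull_requests silently restricts the median to the linked subset.

```python
import sqlite3
from statistics import median

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Toy stand-ins for the two tables; hours_to_merge is a hypothetical column.
    CREATE TABLE gha_pull_requests (id INTEGER, hours_to_merge REAL);
    CREATE TABLE gha_issues_pull_requests (pull_request_id INTEGER, issue_id INTEGER);
    INSERT INTO gha_pull_requests VALUES (1, 2.0), (2, 4.0), (3, 6.0), (4, 30.0), (5, 48.0);
    -- Only PRs 4 and 5 have a link to their issue data.
    INSERT INTO gha_issues_pull_requests VALUES (4, 104), (5, 105);
""")

def median_hours(sql):
    """Run a query returning one numeric column and take its median."""
    return median(row[0] for row in conn.execute(sql).fetchall())

# Metric A: every PR counts.
all_prs = median_hours("SELECT hours_to_merge FROM gha_pull_requests")

# Metric B: the inner join drops PRs that have no issue link.
linked_only = median_hours("""
    SELECT pr.hours_to_merge
    FROM gha_pull_requests pr
    JOIN gha_issues_pull_requests ipr ON ipr.pull_request_id = pr.id
""")

# PRs without a link, counted with NOT EXISTS.
unlinked = conn.execute("""
    SELECT count(DISTINCT pr.id)
    FROM gha_pull_requests pr
    WHERE NOT EXISTS (
        SELECT 1 FROM gha_issues_pull_requests ipr
        WHERE ipr.pull_request_id = pr.id
    )
""").fetchone()[0]

print(all_prs, linked_only, unlinked)  # prints: 6.0 39.0 3
```

The anti-join here uses NOT EXISTS rather than NOT IN only as a design choice for the sketch: NOT IN returns no rows if the subquery yields a NULL, while NOT EXISTS does not have that pitfall.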

One of those two dashboards' metrics takes into account only PRs that also have issue data, while the other just takes PRs. I can fix it by removing the reference to the issues part, as this particular dashboard is not using any issue data. I will create a fix soon. Metrics used by both dashboards are (just FYI, if you are interested):

lukaszgryglicki commented 11 months ago

This is fixed, see: 1st and 2nd:

[Screenshot 2023-08-7 at 10 50 07] [Screenshot 2023-08-7 at 10 50 17]
Jeanine-tw commented 11 months ago

Thanks for your efforts!