cncf / devstats

📈 CNCF-created tool for analyzing and graphing developer contributions
https://devstats.cncf.io
Apache License 2.0

[bug] Incorrect total commits for user #36

Closed: yussufsh closed this 11 months ago

yussufsh commented 1 year ago

A couple of weeks ago I added my affiliation via https://github.com/cncf/gitdm/pull/169

Now it shows my name with the correct company and country, but the total number of commits shown is not correct (more filtered view here).

For other users in that list, the last decade's stats show their full contributions. For me, I think it only shows contributions after a specific date (I do not know which).

Would be good if you could check specifically for my GH user-id: yussufsh

lukaszgryglicki commented 1 year ago

Will take a look on Monday, thanks for reporting.

lukaszgryglicki commented 12 months ago

Checking this.

lukaszgryglicki commented 12 months ago

It looks like there are 156 contributions for your user: the DevStats dashboard shows 156, and the DB investigation shows the same:

gha=# select * from gha_actors where login = 'yussufsh';
    id    |  login   |     name      | country_id | sex | sex_prob | tz | tz_offset | country_name | age 
----------+----------+---------------+------------+-----+----------+----+-----------+--------------+-----
 17543270 | yussufsh | Yussuf Shaikh | in         |     |          |    |           | India        |    
(1 row)

gha=# select * from gha_actors_affiliations where actor_id = 17543270;
 actor_id |                company_name                 |       dt_from       |        dt_to        |            original_company_name            | source 
----------+---------------------------------------------+---------------------+---------------------+---------------------------------------------+--------
 17543270 | International Business Machines Corporation | 1900-01-01 00:00:00 | 2100-01-01 00:00:00 | International Business Machines Corporation | user
(1 row)

gha=# select count(*) from gha_events where dup_actor_login = 'yussufsh';
 count 
-------
   238
(1 row)

gha=# select count(*) from gha_events where actor_id = 17543270;
 count 
-------
   238
(1 row)

gha=# select count(*) from gha_events where dup_actor_login = 'yussufsh' and type in ('IssuesEvent', 'PullRequestEvent', 'PushEvent', 'CommitCommentEvent', 'IssueCommentEvent', 'PullRequestReviewCommentEvent', 'PullRequestReviewEvent');
 count 
-------
   156
(1 row)

gha=# select min(created_at), max(created_at), count(*) from gha_events where dup_actor_login = 'yussufsh' and type in ('IssuesEvent', 'PullRequestEvent', 'PushEvent', 'CommitCommentEvent', 'IssueCommentEvent', 'PullRequestReviewCommentEvent', 'PullRequestReviewEvent');
         min         |         max         | count 
---------------------+---------------------+-------
 2023-02-06 11:14:58 | 2023-09-29 02:39:41 |   156
(1 row)

Looks like the minimum date for your contribution is:

2023-02-06 11:14:58

I cannot do anything here: the dashboards show exactly what I found manually in the DB. The only way to proceed would be for you to point out which activity is missing, for example by giving me a link to some GitHub contribution that you made before 2023-02-06, or a git commit. I will then investigate whether I can find it or not... @yussufsh
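
For anyone who wants to reproduce this kind of check without access to the database, here is a minimal sketch of the same logic. It is not DevStats code: it uses Python's built-in `sqlite3` with a tiny stand-in `gha_events` table (only the columns used in the queries above), and the login `example-user` and all rows are hypothetical. It shows why the unfiltered event count can be higher than the contribution count, and how the earliest contribution date is derived.

```python
import sqlite3

# Event types counted as contributions in the queries above.
CONTRIBUTION_TYPES = (
    "IssuesEvent", "PullRequestEvent", "PushEvent", "CommitCommentEvent",
    "IssueCommentEvent", "PullRequestReviewCommentEvent", "PullRequestReviewEvent",
)


def contribution_summary(conn, login):
    """Return (all_events, contributions, first_contribution, last_contribution)."""
    total = conn.execute(
        "select count(*) from gha_events where dup_actor_login = ?", (login,)
    ).fetchone()[0]
    placeholders = ", ".join("?" for _ in CONTRIBUTION_TYPES)
    count, first, last = conn.execute(
        "select count(*), min(created_at), max(created_at) from gha_events "
        f"where dup_actor_login = ? and type in ({placeholders})",
        (login, *CONTRIBUTION_TYPES),
    ).fetchone()
    return total, count, first, last


def build_demo_db():
    """Create an in-memory stand-in table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table gha_events ("
        "id integer primary key, type text, dup_actor_login text, created_at text)"
    )
    conn.executemany(
        "insert into gha_events (type, dup_actor_login, created_at) values (?, ?, ?)",
        [
            ("PushEvent", "example-user", "2023-02-06 11:14:58"),
            ("WatchEvent", "example-user", "2023-01-15 09:00:00"),  # not a contribution type
            ("IssueCommentEvent", "example-user", "2023-09-29 02:39:41"),
            ("PushEvent", "someone-else", "2022-05-01 00:00:00"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(contribution_summary(build_demo_db(), "example-user"))
```

In the stand-in data the `WatchEvent` row is counted in the total but not as a contribution, which is the same kind of difference as 238 versus 156 above.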

lukaszgryglicki commented 12 months ago

Commits alone:

gha=# select count(*) from gha_commits where author_id = 17543270;
 count 
-------
    28
(1 row)

gha=# select count(*) from gha_commits where committer_id = 17543270;
 count 
-------
    16
(1 row)
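
The two numbers differ because in git the author and the committer of a commit can be different people (for example when someone else applies or rebases the change). A minimal sketch of counting both roles in one pass, again using Python's `sqlite3` with a hypothetical stand-in table rather than the real database; the `sha` column and all rows are assumptions for illustration, and unlike the `count(*)` queries above it counts distinct commits.

```python
import sqlite3


def commit_counts(conn, actor_id):
    """Return (authored, committed, either) distinct commit counts for one actor."""
    return conn.execute(
        "select "
        "count(distinct case when author_id = :a then sha end), "
        "count(distinct case when committer_id = :a then sha end), "
        "count(distinct sha) "
        "from gha_commits where author_id = :a or committer_id = :a",
        {"a": actor_id},
    ).fetchone()


def build_commits_demo_db():
    """Create an in-memory stand-in gha_commits table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table gha_commits (sha text, author_id integer, committer_id integer)"
    )
    conn.executemany(
        "insert into gha_commits values (?, ?, ?)",
        [
            ("sha1", 1, 1),  # authored and committed by actor 1
            ("sha2", 1, 2),  # authored by 1, committed by someone else
            ("sha3", 2, 1),  # authored by someone else, committed by 1
            ("sha4", 2, 2),  # unrelated to actor 1
            ("sha5", 1, 3),  # authored by 1, committed by someone else
        ],
    )
    return conn


if __name__ == "__main__":
    print(commit_counts(build_commits_demo_db(), 1))
```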

lukaszgryglicki commented 12 months ago

If I check all projects (not only Kubernetes), I can see this:

allprj=# select r.repo_group, min(e.created_at), max(e.created_at), count(distinct e.id) from gha_events e, gha_repos r where e.repo_id = r.id and e.dup_actor_login = 'yussufsh' and e.type in ('IssuesEvent', 'PullRequestEvent', 'PushEvent', 'CommitCommentEvent', 'IssueCommentEvent', 'PullRequestReviewCommentEvent', 'PullRequestReviewEvent') group by r.repo_group;
 repo_group |         min         |         max         | count 
------------+---------------------+---------------------+-------
 CNCF       | 2023-09-18 10:29:53 | 2023-10-06 13:31:16 |     3
 Kubernetes | 2023-02-06 11:14:58 | 2023-10-04 02:47:36 |   156
(2 rows)

You have contributed only to Kubernetes and CNCF (just 3 contributions to the latter); the minimum date is again 2023-02-06.

Per repo:

allprj=# select r.name, min(e.created_at), max(e.created_at), count(distinct e.id) from gha_events e, gha_repos r where e.repo_id = r.id and e.dup_actor_login = 'yussufsh' and e.type in ('IssuesEvent', 'PullRequestEvent', 'PushEvent', 'CommitCommentEvent', 'IssueCommentEvent', 'PullRequestReviewCommentEvent', 'PullRequestReviewEvent') group by r.name order by 4 desc;
                     name                      |         min         |         max         | count 
-----------------------------------------------+---------------------+---------------------+-------
 kubernetes-sigs/ibm-powervs-block-csi-driver  | 2023-05-03 17:32:37 | 2023-10-04 02:47:36 |   118
 kubernetes-sigs/cluster-api-provider-ibmcloud | 2023-02-06 11:14:58 | 2023-03-17 06:48:27 |    26
 kubernetes/test-infra                         | 2023-08-05 06:13:41 | 2023-08-25 11:48:41 |     5
 kubernetes/                                   | 2023-08-05 06:13:41 | 2023-08-25 11:48:41 |     5
 kubernetes/k8s.io                             | 2023-08-25 04:55:27 | 2023-08-26 07:19:54 |     4
 kubernetes-sigs/windows-operational-readiness | 2023-09-11 03:52:15 | 2023-09-11 03:57:30 |     3
 cncf/devstats                                 | 2023-10-06 13:07:17 | 2023-10-06 13:31:16 |     2
 kubernetes/org                                | 2023-08-23 14:55:45 | 2023-08-23 14:55:45 |     1
 cncf/gitdm                                    | 2023-09-18 10:29:53 | 2023-09-18 10:29:53 |     1
(9 rows)
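
To see where recorded activity starts, a per-month breakdown can also help. This is only a sketch under assumptions: it runs against a hypothetical in-memory SQLite stand-in for `gha_events`, with made-up rows and the login `example-user`. SQLite's `strftime` is used for the month bucket; on PostgreSQL something like `to_char(created_at, 'YYYY-MM')` would play the same role.

```python
import sqlite3


def monthly_counts(conn, login):
    """Return [(month, events)] in chronological order for one login."""
    return conn.execute(
        "select strftime('%Y-%m', created_at) as month, count(distinct id) "
        "from gha_events where dup_actor_login = ? "
        "group by month order by month",
        (login,),
    ).fetchall()


def build_monthly_demo_db():
    """Create an in-memory stand-in gha_events table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table gha_events ("
        "id integer primary key, dup_actor_login text, created_at text)"
    )
    conn.executemany(
        "insert into gha_events (dup_actor_login, created_at) values (?, ?)",
        [
            ("example-user", "2023-02-06 11:14:58"),
            ("example-user", "2023-02-20 08:00:00"),
            ("example-user", "2023-05-03 17:32:37"),
            ("someone-else", "2021-01-01 00:00:00"),
        ],
    )
    return conn


if __name__ == "__main__":
    for month, events in monthly_counts(build_monthly_demo_db(), "example-user"):
        print(month, events)
```

The first month in the result is the earliest month with any recorded event for that login; anything the user remembers doing before it is a candidate for the missing data.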

lukaszgryglicki commented 12 months ago

I'm marking this as blocked until I have some example of data that we are missing. For now, the dashboards show correct results when checked against the data we have.

yussufsh commented 11 months ago

Thanks for looking, @lukaszgryglicki. I am satisfied with the data.

lukaszgryglicki commented 11 months ago

OK, thanks for reporting @yussufsh