Open dtolnay opened 1 year ago
Thanks for the report! I'm not too familiar with that part of the code, so I'll require a bit of time to fully investigate it.
I think there is actually something much more wrong going on, which I haven't found yet. But I am skeptical that the one bug I described fully explains the current state of the worldwide_public page, because:
- I haven't been able to find any indication of such errors being logged in the GitHub Actions logs; and
- the worldwide_public page is so wrong. The code I linked sets a limit of 10 retries throughout the entire sequence of queries, so in the absolute worst scenario, one would only expect several dozen users to get missed due to this `continue Pages`-related bug. I think there are instead hundreds to thousands of users missing in what's currently shown on the worldwide_public page.
Notice how united_states_public currently shows that 1000 USA users have followers >=1069. Meanwhile worldwide_public currently shows only 1000 worldwide users have followers >=972. That makes no sense because if 1000 USA users have >=1069 followers, then somebody with only 972 followers can't possibly be in the worldwide top 1000 by followers.
I have observed multiple times that users temporarily disappear from the site, when they should be meeting the cutoffs for followers and contributions. For example this recently affected my account in https://github.com/ashkulz/committers.top/commit/a23a53898085ea89d8113d7a24fd9cb35efcfad5. Right now, I appear on https://committers.top/united_states_public, but incorrectly do not appear on https://committers.top/worldwide_public. Other users who have fewer followers and fewer contributions than me appear on that page, which should not be possible.
From examining the code, I have found one bug that could cause this behavior.
In this function:
https://github.com/ashkulz/committers.top/blob/079aa6e8c8c0c70601d5dc3c3fd97dd2e9d6b181/github/github.go#L64
we perform queries for `followers:<N` where N is the minimum follower count observed so far. After getting back each page of results, the `minFollowerCount` is appropriately updated. However, if an error occurs during any request and `continue Pages` is executed, there may still be users with follower count equal to `minFollowerCount` who have not yet been returned.

For example, if the first query returns users with follower count 900, 890, 889, 889, 887 then `minFollowerCount` is set to 887. Ordinarily the next request would visit the second page of the same query, using GitHub's `after:` pagination, and may include additional users with an 887 follower count. However, if an error occurs during the second request, after a retry the next query will execute with `followers:<887` and any other users with follower count equal to 887 will have been lost.