r-universe-org / help

Support and bug tracker for R-universe
https://docs.r-universe.dev/

Accessing GitHub blackbird_count API for estimating package popularity #501

Open jeroen opened 1 month ago

jeroen commented 1 month ago

To estimate the number of scripts that load a given R package, we want to use GitHub code search with a query such as library(jsonlite): https://github.com/search?q=library%28jsonlite%29&type=code

The top-left corner of the results page shows the number we are interested in. Unfortunately, this number does not appear to be exposed via any GitHub API.

[Image: Screenshot 2024-10-01 at 8 14 30 pm]

It turns out this number comes from a special API called blackbird_count, which currently seems to be available only via the web page.

[Image: Screenshot 2024-10-01 at 8 17 01 pm]

Currently we scrape this data, but it would be really nice if this API could be exposed as part of the public GitHub API.

colinwm commented 1 month ago

Hi @jeroen, the code search API (https://docs.github.com/en/rest/search/search?apiVersion=2022-11-28#search-code) does have a total_count field in the response, which uses the same method as the blackbird_count endpoint to estimate the result count.

The one caveat is that this API doesn't support the same search syntax, so you'll have to write queries that don't use, for example, regular expressions or symbol search. But for the example query you showed, it should work the same way.
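
As a rough illustration of the suggestion above, here is a minimal Python sketch that queries the code search endpoint and reads total_count from the response. The helper names (build_search_url, extract_total_count, fetch_usage_estimate) are hypothetical, the token is assumed to be a personal access token supplied by the caller, and per_page=1 is only there to keep the response small since only the count is needed. It has not been checked against the number shown on the web page.

```python
import json
import urllib.parse
import urllib.request

# REST endpoint from the linked documentation (search code).
API_URL = "https://api.github.com/search/code"


def build_search_url(package):
    """Build the request URL for a plain-text query like library(jsonlite)."""
    query = f"library({package})"
    # urlencode percent-encodes the parentheses, matching the web search URL.
    params = urllib.parse.urlencode({"q": query, "per_page": 1})
    return f"{API_URL}?{params}"


def extract_total_count(payload):
    """Pull the estimated result count out of a decoded JSON response."""
    return int(payload["total_count"])


def fetch_usage_estimate(package, token):
    """Call the API and return total_count (assumes a valid token)."""
    request = urllib.request.Request(
        build_search_url(package),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )
    with urllib.request.urlopen(request) as response:
        return extract_total_count(json.load(response))
```

Calling fetch_usage_estimate("jsonlite", token) with a valid token should return the estimate as an integer. Code search requests are rate limited, so a batch job covering many packages would likely need to pause between calls.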