Open kbroman opened 5 years ago
@kbroman Good idea! For getting started, @jimhester's package itdepends has some code for gathering relevant info like the number of open issues, etc.
https://github.com/jimhester/itdepends/blob/128cd7e42d866c3beaf8ab40ab3cf2e42392208f/R/github.R#L10
👍 Are you thinking about popular, published packages (e.g. tidyverse packages)? If so, I propose something complementary: To provide a package-help group for chirunconf packages. I'll explain more on a separate issue, but the main idea is to help on the technical side so that folks unfamiliar with package-development tools can realize their ideas and share them as a package at the end of the unconf.
@maurolepore I wasn't thinking tidyverse particularly, but rather of trying to crawl github to find repositories that were interesting to people (as indicated by there being issues) but not necessarily kept up.
Maybe related to https://github.com/chirunconf/chirunconf19/issues/32
Great idea! I don't think you'll need to crawl, the GH API is pretty good, and: https://github.com/r-lib/gh exists
The Search API is more restrictive than the other parts of the API. I often get rejected "for triggering abuse mechanisms" even though I am only querying a few hundred results. Searching every R package on GitHub would take some time.
Another possibility would be to start with this curated list of GitHub R packages. Starting from this list, we could then use the GitHub API to query specific attributes of each repository.
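A rough sketch of what that per-repository lookup could look like, assuming the gh package is installed and you already have a vector of "owner/repo" names. `repo_info()` is a hypothetical helper, not something from gh itself; the fields it reads (`open_issues_count`, `stargazers_count`, `pushed_at`) come from the repository object the REST API returns. Note that `open_issues_count` includes open pull requests as well as issues.

```r
# Hypothetical helper: pull a few attributes for one repository.
# `fetch` defaults to gh::gh but can be swapped for a stub when offline.
repo_info <- function(full_name, fetch = gh::gh) {
  parts <- strsplit(full_name, "/", fixed = TRUE)[[1]]
  res <- fetch("GET /repos/{owner}/{repo}", owner = parts[1], repo = parts[2])
  data.frame(
    repo = res$full_name,
    open_issues = res$open_issues_count,  # issues + pull requests
    stars = res$stargazers_count,
    last_push = as.Date(substr(res$pushed_at, 1, 10)),
    stringsAsFactors = FALSE
  )
}

# Example (needs network access, ideally with a GITHUB_PAT set):
# repos <- c("slowkow/ggrepel", "rich-iannone/DiagrammeR")
# do.call(rbind, lapply(repos, repo_info))
```

This uses the regular repository endpoint rather than the Search API, so it should be less likely to run into the search-specific limits mentioned above, at the cost of one request per repository.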
The GH API lets you search repositories by number of help-wanted issues, which might be a way to go to find some places to help.
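library(purrr)
# Search R repositories, sorted by number of help-wanted issues
search <- gh::gh("/search/repositories", q = "language:r", sort = "help-wanted-issues", order = "desc")
# The matching repositories are in the `items` element of the response
map_chr(search$items, "full_name")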
#> [1] "UptakeOpenSource/uptasticsearch"
#> [2] "BioinformaticsFMRP/TCGAbiolinks"
#> [3] "cbeleites/hyperSpec"
#> [4] "Huh/PopR_SDGFP"
#> [5] "International-Soil-Radiocarbon-Database/ISRaD"
#> [6] "TommyJones/textmineR"
#> [7] "UptakeOpenSource/pkgnet"
#> [8] "ropenscilabs/learngganimate"
#> [9] "slowkow/ggrepel"
#> [10] "BonnyCI/ci-plunder"
#> [11] "OpenMx/OpenMx"
#> [12] "ProvTools/provR"
#> [13] "statnet/ergm"
#> [14] "retrography/OrientR"
#> [15] "trestletech/plumber"
#> [16] "ices-tools-prod/fisheryO"
#> [17] "jackwasey/icd"
#> [18] "rich-iannone/DiagrammeR"
#> [19] "ropensci/tabulizer"
#> [20] "cloudyr/googleComputeEngineR"
#> [21] "TIBHannover/BacDiveR"
#> [22] "HenrikBengtsson/aroma.seq"
#> [23] "HenrikBengtsson/Wishlist-for-R"
#> [24] "theclue/facebook.S4"
#> [25] "NKU-DSC/RTrainingMaterials"
#> [26] "kevinwolz/hisafer"
#> [27] "fabian-s/tidyfun"
#> [28] "vertica/DistributedR"
#> [29] "ropenscilabs/dataspice"
#> [30] "franzbischoff/tsmp"
Created on 2019-03-08 by the reprex package (v0.2.1)
Of course this depends on the repo owners using that specific tag for issues, which many do not.
Maybe we could identify R packages on GitHub and measure something like the number of open issues vs. time since the last commit. That might point to packages of interest that need help.
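One way to turn that into a number, purely as a sketch: weight the open-issue count by how long the repository has been quiet. The `neglect_score()` function, the log weighting, and the repositories and figures in the data frame are all made up for illustration; nothing here is an agreed-upon metric.

```r
# Hypothetical score: many open issues + long time since last push = high score.
# log1p() dampens the effect of very old repositories.
neglect_score <- function(open_issues, last_push, today = Sys.Date()) {
  days_idle <- as.numeric(today - as.Date(last_push))
  open_issues * log1p(pmax(days_idle, 0))
}

# Made-up repositories and numbers, for illustration only
repos <- data.frame(
  repo = c("a/active", "b/stale", "c/quiet"),
  open_issues = c(40, 25, 2),
  last_push = as.Date(c("2019-03-01", "2017-06-15", "2016-01-10")),
  stringsAsFactors = FALSE
)

repos$score <- neglect_score(repos$open_issues, repos$last_push,
                             today = as.Date("2019-03-08"))

# Highest score first: candidates that people care about but nobody is tending
repos[order(-repos$score), c("repo", "score")]
```

With this weighting, a repository with a moderate number of issues and no recent commits ranks above a busy repository that was pushed to last week, and a long-dead repository with almost no issues falls to the bottom. Other weightings would be just as defensible.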