Open MathewBiddle opened 7 months ago
Maybe something like this?
import os

import pandas as pd
from github import Github
from safer import open

# Read a personal access token from ~/.ghoauth, if the file exists.
try:
    with open(os.path.expanduser("~/.ghoauth"), "r") as f:
        access_token = f.read()
        access_token = str(access_token).strip()
except FileNotFoundError:
    access_token = None

g = Github(access_token)
user = g.get_user("ioos")
repos = user.get_repos()

# Collect stars, forks, and contribution counts for every non-fork repo.
ioos_gh = {}
for repo in repos:
    print(repo.name)
    if repo.fork is False:
        stars = repo.stargazers_count
        contributors = repo.get_contributors()
        # contributor.name is None when the profile has no display name.
        contributors_contribution = {
            contributor.name: contributor.contributions
            for contributor in contributors
        }
        ioos_gh.update(
            {
                repo.name: {
                    "stars": stars,
                    "forks": repo.forks,
                    "contributors": contributors_contribution,
                },
            }
        )

# One row per repo, most-starred first.
df = pd.DataFrame(ioos_gh).T.sort_values(by="stars", ascending=False)
You will need a GH token to run it, but it should not require elevated permissions; read access should be enough. Here is what I got from the code above:
df.head(n=20)
                              stars  forks  contributors
compliance-checker               96     51  [{'Benjamin Adams': 713}, {'Luke Campbell': 33...
erddapy                          75     29  [{'Filipe': 740}, {'Vini Salazar': 83}, {'Call...
bio_data_guide                   43     18  [{'Mathew Biddle': 222}, {'Tylar': 81}, {'Bret...
ioos_qc                          39     22  [{'Kyle Wilcox': 199}, {'Filipe': 71}, {'Luke ...
pyoos                            34     33  [{'Filipe': 52}, {'Dave Foster': 24}, {'Emilio...
conda-recipes                    20     29  [{'Filipe': 1186}, {'Rich Signell': 289}, {'IO...
notebooks_demos                  19     19  [{'Filipe': 774}, {'Jennifer Bosch Webster': 7...
gsoc                             16      9  [{'Mathew Biddle': 28}, {'Micah Wengren': 26},...
thredds_crawler                  16     22  [{'Kyle Wilcox': 63}, {'Luke Campbell': 15}, {...
Cloud-Sandbox                     9     11  [{'Patrick Tripp': 113}, {'Jonathan Joyce': 9}...
ioos-python-package-skeleton      9      9  [{'Filipe': 113}, {None: 3}, {'Alex Kerney': 2...
BioData-Training-Workshop         8      8  [{'Don Setiawan': 41}, {'Ben Best': 17}, {'Fil...
ioos_code_lab                     8      7  [{'Filipe': 1140}, {'Mathew Biddle': 96}, {'Je...
ioosngdac                         8     18  [{'John Kerfoot': 80}, {'Luke Campbell': 20}, ...
erddap-gold-standard              8     15  [{'Mathew Biddle': 16}, {'Kyle Wilcox': 6}, {'...
system-test                       7     14  [{'Bob Fratantonio': 69}, {'Filipe': 68}, {'Ri...
ckanext-ioos-theme                7     14  [{'Benjamin Adams': 202}, {'Luke Campbell': 10...
soundcoop                         6      2  [{'Clea Parcerisas': 15}, {None: 6}, {'Carlos ...
glider-dac                        6     12  [{'Benjamin Adams': 295}, {'Luke Campbell': 20...
service-monitor                   6     13  [{'Luke Campbell': 304}, {'Benjamin Adams': 16...
I like what you've done here @ocefpaf! Maybe quantifying the number of contributors too. But, that should be easy with the list you developed.
FYI, I just ran across this https://opensource.guide/metrics/
This is interesting too https://chaoss.community/software/
We can get a lot of stuff from GitHub's advanced search:
https://github.com/search?q=org%3Aioos&type=repositories&ref=advsearch
> I like what you've done here @ocefpaf! Maybe quantifying the number of contributors too. But, that should be easy with the list you developed.
Yes, we can do something like:
# One Series of contribution counts per repo, named after the repo.
contributors = []
for repo, row in df.iterrows():
    s = pd.Series(row["contributors"])
    s.name = repo
    contributors.append(s)

# Rows are contributors, columns are repos; order rows by total contributions.
index = pd.concat(contributors, axis=1).sum(axis=1).sort_values(ascending=False).index
contributors_per_repo = pd.concat(contributors, axis=1).reindex(index)
contributors_per_repo.sum(axis=1)
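The snippet above totals contributions per person. For the number of contributors itself, a minimal sketch could work directly on a dict shaped like `ioos_gh` from the first snippet. The `count_contributors` helper and the sample data are hypothetical, made up for illustration only.

```python
# Hypothetical sample shaped like the `ioos_gh` dict from the first snippet.
# Repo names, people, and numbers are made up for illustration.
sample = {
    "repo-a": {"stars": 10, "forks": 3, "contributors": {"alice": 40, "bob": 5}},
    "repo-b": {"stars": 2, "forks": 1, "contributors": {"bob": 7, "carol": 1, None: 2}},
}


def count_contributors(org):
    """Return ({repo: contributor count}, number of unique named contributors)."""
    per_repo = {repo: len(info["contributors"]) for repo, info in org.items()}
    unique = set()
    for info in org.values():
        # Skip the None key, which stands for accounts without a display name.
        unique.update(name for name in info["contributors"] if name is not None)
    return per_repo, len(unique)


per_repo, n_unique = count_contributors(sample)
print(per_repo)
print(n_unique)
```

With the pandas table, `contributors_per_repo.notna().sum()` should give the same per-repo count. Note that every account without a display name collapses into the single `None` key, so these counts are a lower bound; keying on `contributor.login` instead of `contributor.name` would likely avoid that.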
> FYI, I just ran across this https://opensource.guide/metrics/
> This is interesting too https://chaoss.community/software/
Those are really nice resources! I knew about CHAOSS but not opensource.guide.
> we can get a lot of stuff from github's advanced search
If you are just browsing, yes. But we can get all that info programmatically with PyGithub and create tables, etc. The repo object in the main loop has all the info and, if you are using an elevated token, you can even do fancy things like write/create, but we don't need that for the metrics.
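As a rough illustration of pulling that information off the repo object, here is a sketch that builds one table row per non-fork repo. The `summarize_repos` helper is hypothetical, and the `SimpleNamespace` objects are stand-ins so the example runs without a token; the attribute names (`name`, `fork`, `stargazers_count`, `forks_count`, `open_issues_count`) are the ones PyGithub exposes on a repository.

```python
from types import SimpleNamespace


def summarize_repos(repos):
    """Build one row per non-fork repo from PyGithub-style repository objects."""
    rows = []
    for repo in repos:
        if repo.fork:
            continue
        rows.append(
            {
                "name": repo.name,
                "stars": repo.stargazers_count,
                "forks": repo.forks_count,
                # GitHub counts open pull requests in this number too.
                "open_issues": repo.open_issues_count,
            }
        )
    # Most-starred first, like the table earlier in the thread.
    return sorted(rows, key=lambda row: row["stars"], reverse=True)


# Made-up stand-ins for the objects returned by user.get_repos().
fake_repos = [
    SimpleNamespace(name="repo-a", fork=False, stargazers_count=3,
                    forks_count=1, open_issues_count=0),
    SimpleNamespace(name="repo-b", fork=False, stargazers_count=12,
                    forks_count=4, open_issues_count=2),
    SimpleNamespace(name="repo-fork", fork=True, stargazers_count=99,
                    forks_count=0, open_issues_count=0),
]

table = summarize_repos(fake_repos)
for row in table:
    print(row)
```

With a real token, `summarize_repos(user.get_repos())` should work the same way, and the resulting list of dicts can be passed straight to `pd.DataFrame`.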
It could also be worthwhile to look at the number of participants in issues: https://gist.github.com/ocefpaf/2ed11e4c977adfe3ffeb5eef9f576c1e
While they might not be directly contributing to a project, they are participating in the conversation.
> While they might not be directly contributing to a project, they are participating in the conversation.
That indeed made a few repos pop up, like ioos-atn-data and bio_data_guide. See the last two cells in https://gist.github.com/ocefpaf/11a7c4832b23dc3978a1a3fb20783988
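For readers who do not want to open the gists, counting issue participants could look roughly like the sketch below. This is not the code from the linked gists; `issue_participants` is a hypothetical helper, and the fake objects only mimic the PyGithub calls it relies on (`repo.get_issues(state="all")`, `issue.user.login`, `issue.get_comments()`).

```python
from types import SimpleNamespace


def issue_participants(repo):
    """Collect the logins of everyone who opened or commented on an issue."""
    logins = set()
    # With PyGithub, get_issues also returns pull requests.
    for issue in repo.get_issues(state="all"):
        if issue.user is not None:
            logins.add(issue.user.login)
        for comment in issue.get_comments():
            if comment.user is not None:
                logins.add(comment.user.login)
    return logins


# Made-up stand-ins so the sketch runs without network access or a token.
def _user(login):
    return SimpleNamespace(login=login)


def _issue(author, commenters):
    comments = [SimpleNamespace(user=_user(name)) for name in commenters]
    return SimpleNamespace(user=_user(author), get_comments=lambda: comments)


fake_repo = SimpleNamespace(
    get_issues=lambda state="open": [
        _issue("alice", ["bob", "alice"]),
        _issue("carol", []),
    ]
)

participants = issue_participants(fake_repo)
print(sorted(participants))
```

On a real repo this walks every issue and every comment, so it makes many API calls and may run into rate limits on large repositories.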
Can we develop a metric to quantify the impact of the IOOS GitHub organization?
Related to #26, but expanding further into our non-packaged repositories (e.g. documentation).
Number of forks, stars, active contributors, etc.