github-data-tools
Description
A handy set of tools that works in three steps. First, it downloads data from the GitHub Archive (GHA) website. Second, it unpacks that data and loads the .json files into a MongoDB database. Finally, it analyzes those GitHub events to create a list of developers and their attributes, which are important for typical OSS and COIN analysis.
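The download and unpack steps can be sketched as follows. This is a minimal illustration, not the project's own code: the URL pattern is an assumption based on the hourly file naming that GH Archive publishes, and `archive_url` and `read_events` are hypothetical helper names. It also assumes each archive is a gzipped file with one JSON event per line.

```python
import gzip
import io
import json
from datetime import datetime

# Assumption: hourly archives are named like 2015-01-01-15.json.gz,
# with the hour not zero-padded.
ARCHIVE_URL = "https://data.gharchive.org/{day}-{hour}.json.gz"


def archive_url(moment):
    """Build the (assumed) archive URL for the hour containing `moment`."""
    return ARCHIVE_URL.format(day=moment.strftime("%Y-%m-%d"), hour=moment.hour)


def read_events(raw_gzip_bytes):
    """Yield one event dict per non-empty line of a gzipped JSON-lines archive."""
    with gzip.open(io.BytesIO(raw_gzip_bytes), mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:  # skip blank lines between events
                yield json.loads(line)
```

The dictionaries yielded by `read_events` could then be passed in batches to a MongoDB client, for example pymongo's `insert_many`, to complete the import step.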
Requirements
- Python with all required packages installed
- MongoDB with GHA events already imported, or at least the downloaded GHA files (import them first)
Dimensional output data
Dimensions which we want to describe for a developer:
- Number of users he follows [FollowEvent]
- Number of followers [FollowEvent]
- Number of developers in projects created by him [PushEvent] [IssuesEvent] [PullRequestEvent] [GollumEvent]
- Number of collaborators in projects created by him [TeamAddEvent] [MemberEvent]
- Number of repos, not created by him, in which he is a collaborator [TeamAddEvent] [MemberEvent]
- Number of repos, not created by him, in which he is a contributor [PushEvent] [IssuesEvent] [PullRequestEvent] [GollumEvent]
- Code quality globally and in a repo [from sources other than GHA]
- Time spent in a repo [PushEvent]
- Number of commits per skill [PushEvent]
- Number of commits globally
- Number of commits in a repo
- Ratio of commits globally to commits in a repo
- Number of discussions in a repo [CommitCommentEvent] [IssueCommentEvent] [PullRequestReviewCommentEvent]
- Number of closed 'feature issues' by him globally [IssuesEvent]
- Number of closed 'bug issues' by him globally [IssuesEvent]
- Time since the first commit [PushEvent]
- Time since the last commit [PushEvent]
- Time between commits [PushEvent]
- Usual time of committing [PushEvent]
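As a rough illustration of how a few of the dimensions above could be derived, the sketch below counts followers and followed users from FollowEvent records and computes the time between pushes from PushEvent records. It assumes a simplified, flattened event layout in which `actor` and `target` are plain login strings and `created_at` is a UTC timestamp such as `2013-01-01T10:00:00Z`. Real GHA records nest these fields differently depending on the archive period, and the function names here are hypothetical.

```python
from collections import Counter
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed timestamp layout


def follow_counts(events):
    """Return (following, followers) counters keyed by login."""
    following, followers = Counter(), Counter()
    for event in events:
        if event.get("type") != "FollowEvent":
            continue
        following[event["actor"]] += 1   # the actor follows one more user
        followers[event["target"]] += 1  # the target gains one follower
    return following, followers


def commit_gaps(events, login):
    """Return the time deltas between consecutive pushes by `login`."""
    stamps = sorted(
        datetime.strptime(e["created_at"], TIME_FORMAT)
        for e in events
        if e.get("type") == "PushEvent" and e["actor"] == login
    )
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]
```

Two caveats apply to this sketch: repeated FollowEvent records for the same pair of users would be counted more than once, and a PushEvent timestamp records when the push happened, which only approximates the time of the individual commits it carries.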
Find out more at:
http://wikiteams.github.io/github-data-tools