-
The project table does not include how many forks/stars the project has at a given time. It would be helpful to include this.
-
Introduce a new attribute that determines if a repository is active or not. Deciding the status may involve seeing how far off the last commit to the repository is from today (or the date of the GHTor…
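One way to read the proposal above is as a simple threshold on the gap between the last commit and a reference date. The following is only a sketch under that assumption: the function name `is_active`, the 180-day cutoff, and the example dates are hypothetical and not taken from the project.

```python
from datetime import date, timedelta


def is_active(last_commit: date, reference: date, max_idle_days: int = 180) -> bool:
    """Return True if the last commit is at most max_idle_days before reference.

    `reference` can be today's date or the date of the data dump being analysed.
    The 180-day default is an illustrative assumption, not a project setting.
    """
    return (reference - last_commit) <= timedelta(days=max_idle_days)


# Hypothetical reference date standing in for "today" or the dump date.
reference_date = date(2015, 4, 2)

recent = is_active(date(2015, 1, 15), reference_date)
stale = is_active(date(2013, 6, 1), reference_date)
print(recent, stale)
```

Passing the reference date explicitly keeps the result reproducible when the attribute is computed against an older dump rather than the current day.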
-
`create_active_range` function in `./attributes/management/main.py` uses the `commits` table to find the latest commit associated with a project. The GHTorrent [documentation](http://www.ghtorrent.org…
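For readers unfamiliar with this kind of lookup, here is a rough sketch of finding the latest commit per project from a `commits` table. It is not the body of `create_active_range`: the column names `project_id` and `created_at`, the helper name `latest_commit_per_project`, the sample rows, and the use of an in-memory SQLite database are all assumptions made for illustration.

```python
import sqlite3


def latest_commit_per_project(conn: sqlite3.Connection) -> dict:
    """Map each project_id to the timestamp of its most recent commit."""
    rows = conn.execute(
        "SELECT project_id, MAX(created_at) "
        "FROM commits GROUP BY project_id ORDER BY project_id"
    ).fetchall()
    return dict(rows)


# Build a tiny stand-in table; the schema here is assumed, not the real dump.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commits ("
    "id INTEGER PRIMARY KEY, project_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO commits (project_id, created_at) VALUES (?, ?)",
    [
        (1, "2014-01-02 09:00:00"),
        (1, "2014-03-10 12:00:00"),
        (2, "2013-11-05 08:30:00"),
    ],
)

latest = latest_commit_per_project(conn)
print(latest)
```

Timestamps are stored as ISO-style strings so that `MAX` orders them chronologically. The result is only as reliable as the link between a commit row and its project, so it depends on which table supplies that link.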
-
Hello,
My organization just utilized the GitHub Mirror (more specifically the GHTorrent SQL Data from 4/2) to generate a dataset detailing programming language popularity by country. In order to do t…
-
Hello,
This is regarding the schema given here:
http://ghtorrent.org/files/schema.pdf
The id key is used in many tables and I am unable to understand it clearly. Consider the following cases:
1. The…
-
# Paper Link
http://dl.acm.org/citation.cfm?id=2639506&CFID=609851366&CFTOKEN=28229506
# Data Link
http://ghtorrent.org/relational.html
-
data set: 106MB
# MSR'14 Mining Challenge
The International Working Conference on Mining Software Repositories (MSR) has hosted a mining challenge since 2006. With this challenge we call upon everyon…
-
Link to paper http://dl.acm.org/ft_gateway.cfm?id=2568260&ftid=1467975&dwn=1&CFID=609849406&CFTOKEN=18730546
Link to data (Link to a paper that contains the data) -
[G. Gousios. The GHTorrent dataset …
-
"We have lots of experience with generating useful reports" -> Link to Georgios's site.
@gousiosg, could you provide me a stable link to a nice sample report of yours?
-
It would be a good experiment to see how well this works. Presumably datashape discovery and such might be more challenging in some cases. Presumably complex Joins and such might not be available. …