Option to provide list of repositories via a configuration file

dbschenker / contribution-checker

A tool to identify certain authors in repositories and analyze their commits. Helpful to check if and to which degree an organization's members have contributed to third-party projects.

Apache License 2.0


Option to provide list of repositories via a configuration file #5

Open mxmehl opened 1 year ago

mxmehl commented 1 year ago

To get a better overview of a set of repositories, we need a way to specify multiple repositories. I would suggest an option -f, --file that takes the name of a configuration file containing all the information needed for the analysis: the expression for filtering by author email and a list of repository URLs.
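To make the proposal concrete, here is a minimal sketch of how such an option could be parsed and the file loaded. The JSON layout and the key names `email` and `repositories` are assumptions for illustration only; the issue does not define a file format, and the tool may well end up using something else.

```python
import argparse
import json
import re
from pathlib import Path


def parse_args(argv=None):
    """Parse the proposed -f/--file option."""
    parser = argparse.ArgumentParser(
        description="Check contributions across several repositories"
    )
    parser.add_argument(
        "-f",
        "--file",
        type=Path,
        required=True,
        help="configuration file (hypothetical JSON layout)",
    )
    return parser.parse_args(argv)


def load_config(path):
    """Return (email_pattern, repository_urls) from the configuration file.

    Assumed layout, not defined anywhere in the issue:
    {"email": "<regex>", "repositories": ["<url>", ...]}
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    pattern = config["email"]
    repos = config["repositories"]
    # Compile once so that an invalid expression fails before any analysis starts.
    re.compile(pattern)
    if not isinstance(repos, list) or not all(isinstance(r, str) for r in repos):
        raise ValueError("'repositories' must be a list of URL strings")
    return pattern, repos
```

A call such as `load_config(parse_args().file)` would then yield the filter expression and the URL list for the rest of the tool to iterate over.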

To make this efficient, we might want to cache repositories at a defined location, so that the tool only transfers the minimal amount of data and doesn't check out all repositories from scratch with every invocation (see #4).
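One possible shape for such a cache, as a rough sketch: clone each repository once as a mirror and only fetch on later runs. The function names and the hashed directory layout are hypothetical, and the sketch assumes the analysis needs only commit history, so a mirror without a working tree is enough.

```python
import hashlib
import subprocess
from pathlib import Path


def cache_path(cache_dir, url):
    """Map a repository URL to a directory inside the cache (assumed layout)."""
    # Hashing the URL gives a filesystem-safe, unique directory name.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / digest


def sync_command(cache_dir, url):
    """Return the git command that brings the cached copy up to date."""
    target = cache_path(cache_dir, url)
    if target.exists():
        # Already cached: transfer only what changed since the last run.
        return ["git", "-C", str(target), "fetch", "--prune", "origin"]
    # First run: mirror clone, which stores all refs without a working tree.
    return ["git", "clone", "--mirror", url, str(target)]


def sync_repo(cache_dir, url):
    """Clone or update the repository and return its cache location."""
    command = sync_command(cache_dir, url)
    subprocess.run(command, check=True)
    return cache_path(cache_dir, url)
```

Splitting the decision (`sync_command`) from the execution (`sync_repo`) keeps the caching logic testable without network access.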

The report should then show a summary instead of the results for an individual repository.

mxmehl commented 1 year ago

I wonder what the summary for a multi-repo check should look like. Wouldn't it be better to just print all reports as a list of dicts, and also wrap the output for a single repo in a list? This way, it could easily be analysed by subsequent steps or tools.
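A small sketch of that idea, assuming each report is a plain dict and that JSON is an acceptable output format (neither is settled in this issue; the function names are hypothetical):

```python
import json


def as_report_list(result):
    """Wrap a single report in a list so the output shape never changes."""
    if isinstance(result, dict):
        return [result]
    return list(result)


def render(result):
    """Serialize one or many reports as a JSON array."""
    return json.dumps(as_report_list(result), indent=2)
```

Consumers can then always iterate over the top-level array, whether the tool was run against one repository or many.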

cornelius commented 1 year ago

For a summary I would expect a report like this:

| Repository | Number of ORG contributors | Total ORG contributions | Latest contribution |
| --- | --- | --- | --- |
| https://github.com/example/one | 3 | 47 | 2023-08-01 |
| https://github.com/example/two | 1 | 2 | 2018-12-24 (more than a year ago) |
| https://github.com/example/three | 0 | 0 | no contributions |

Where "ORG" would be a label or title for the specific contributions the checker filtered for.
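A report like this could be derived from per-repository results roughly as follows. This is only a sketch: the dict keys `repository`, `contributors`, `contributions`, and `latest` are assumed names, and "more than a year ago" is interpreted here as more than 365 days.

```python
from datetime import date


def latest_label(latest, today):
    """Format the 'Latest contribution' cell."""
    if latest is None:
        return "no contributions"
    day = date.fromisoformat(latest)
    if (today - day).days > 365:
        return f"{latest} (more than a year ago)"
    return latest


def summary_rows(reports, org, today=None):
    """Build header and data rows; `org` is the label for the filtered group."""
    today = today or date.today()
    rows = [[
        "Repository",
        f"Number of {org} contributors",
        f"Total {org} contributions",
        "Latest contribution",
    ]]
    for report in reports:
        rows.append([
            report["repository"],
            str(report["contributors"]),
            str(report["contributions"]),
            latest_label(report.get("latest"), today),
        ])
    return rows


def render_tsv(rows):
    """Render the rows as tab-separated text."""
    return "\n".join("\t".join(row) for row in rows)
```

Passing `today` explicitly keeps the age annotation reproducible, which helps when comparing reports generated on different days.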

I might also like to see some additional fields, such as "percentage of ORG contributions", "average ORG activity per week", or "average total activity per week".
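These fields could be computed along the following lines. The sketch assumes "contributions" and "activity" both mean commit counts and that the weekly average is taken over an inclusive date range; both are interpretations, and the function names are hypothetical.

```python
from datetime import date


def percentage(org_commits, total_commits):
    """Share of commits made by the filtered authors, in percent."""
    if total_commits == 0:
        # An empty repository has no share to report.
        return 0.0
    return 100.0 * org_commits / total_commits


def per_week(commits, first_day, last_day):
    """Average number of commits per week between two dates (inclusive)."""
    days = (last_day - first_day).days + 1
    if days <= 0:
        raise ValueError("last_day must not be before first_day")
    return commits / (days / 7)
```

Calling `per_week` once with the ORG commit count and once with the total commit count would give the two activity figures.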