Closed by alexeagle 2 years ago
This would be a lot easier to review if it were split into two commits: one with the code and a second with the initial data import. An initial observation: the latest tag author and committer fields include names and email addresses. Do we need some kind of privacy impact assessment for this? Whilst this is publicly available data, we are aggregating PII without explicit consent.
@jsharpe you're right, let's just review the code for now; no one wants to review commits that are just data dumps. In fact, we'll want some automation so that those come from scheduled GitHub Actions that generate trivial PRs.
I don't think we're obligated to have any privacy policy for this org.
This includes a config file listing all known repos (which we will add to over time), a schema and a JS file that provides default values, a Bash script that scrapes data for each ruleset, and finally a checked-in copy of the raw data for each ruleset.
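For readers who have not opened the diff, here is a rough sketch of what "a JS file to provide default values" could look like. It is not the code in this PR: the field names (`branch`, `archived`), the `withDefaults` helper, and the `example-org/...` repo names are all hypothetical, chosen only to illustrate merging defaults under each config entry.

```javascript
// Hypothetical defaults; the real schema and field names in this PR may differ.
const DEFAULTS = { branch: "main", archived: false };

// Merge the defaults under each repo entry, so values set explicitly
// in the config file take precedence over the defaults.
function withDefaults(repos, defaults = DEFAULTS) {
  return repos.map((repo) => ({ ...defaults, ...repo }));
}

// Made-up config entries, for illustration only.
const repos = withDefaults([
  { name: "example-org/rules_foo" },
  { name: "example-org/rules_bar", branch: "master" },
]);

console.log(JSON.stringify(repos, null, 2));
```

Spreading the defaults first and the entry second keeps the config file short: a repo only needs to list the fields where it deviates.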
In the future we'll want some tooling to present the data or export it into other tooling, such as a spreadsheet similar to https://docs.aspect.dev/stats
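As a starting point for the spreadsheet export, something along these lines might be enough. This is only a sketch under assumptions: the record fields (`name`, `stars`, `latest_tag`) and the sample rows are invented, and `csvCell` / `toCsv` are hypothetical helpers, not part of this PR.

```javascript
// Quote a cell the way spreadsheets expect: wrap it in double quotes
// if it contains a comma, a quote, or a line break, and double any quotes.
function csvCell(value) {
  const s = String(value ?? "");
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Turn an array of records into CSV text with a header row.
function toCsv(rows, columns) {
  const header = columns.map(csvCell).join(",");
  const lines = rows.map((row) => columns.map((c) => csvCell(row[c])).join(","));
  return [header, ...lines].join("\n");
}

// Invented sample records; the real scraped data will have other fields.
const sample = [
  { name: "example-org/rules_foo", stars: 12, latest_tag: "v1.0.0" },
  { name: "example-org/rules_bar", stars: 3, latest_tag: 'v0.2, "beta"' },
];

const csv = toCsv(sample, ["name", "stars", "latest_tag"]);
console.log(csv);
```

Passing the column list explicitly also gives one place to leave out fields we decide not to publish, such as the committer email addresses raised above.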