Is your feature request related to a problem? Please describe.
We will start aggregating Scorecards prescriptions by project repository revision, as present in the new scorecards-v2 BigQuery dataset, and may also create those prescriptions for packages from other ecosystems. We therefore need a more scalable way to make this data available.
The prescriptions dataset is currently ~500M in size; with these additions it will far exceed the recommended GitHub limit of 5GiB for a repository and cause storage and performance issues.
Describe the solution you'd like
Set up a new database (possibly non-relational), or make new Scorecards prescriptions available in an S3 bucket accessed through a web service.
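To make the bucket option more concrete, here is a minimal sketch of how prescriptions could be keyed by repository revision. Everything in it is an assumption for illustration: the `scorecards/` key prefix, the `prescription_key` helper, the payload fields, and the `InMemoryStore` class, which only stands in for a real bucket client.

```python
import json
from typing import Optional
from urllib.parse import urlparse


def prescription_key(repo_url: str, revision: str) -> str:
    """Build a hypothetical object key: scorecards/<host>/<org>/<repo>/<revision>.json"""
    parsed = urlparse(repo_url)
    path = parsed.path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return f"scorecards/{parsed.netloc}/{path}/{revision}.json"


class InMemoryStore:
    """Stand-in for an object store (hypothetical); a real bucket client would replace it."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, document: dict) -> None:
        # Serialize the prescription as JSON bytes, as an object store would hold it.
        self._objects[key] = json.dumps(document).encode("utf-8")

    def get(self, key: str) -> Optional[dict]:
        # Return the decoded document, or None when no object exists for the key.
        raw = self._objects.get(key)
        return None if raw is None else json.loads(raw)


store = InMemoryStore()
key = prescription_key("https://github.com/thoth-station/core.git", "abc123")
# The payload fields below are placeholders, not the actual prescription schema.
store.put(key, {"revision": "abc123", "checks": {}})
```

With a layout like this, a web service would only need to map a repository and revision to a key and fetch one object, so the dataset size no longer affects repository clones.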
Additional context
Related to https://github.com/thoth-station/core/issues/440