Is your feature request related to a problem? Please describe.
Fetching archives from S3 to GitHub is currently broken, which will also break monkeypox-report generation. Users need a way to access archived snapshots of the monkeypox line list for reproducibility. Archives are currently stored in a private S3 bucket whose contents are fetched by a script; everything other than the cached source URLs is fetched.
Proposed solution
Make the monkeypox bucket public
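One way to do this is a bucket policy that grants anonymous `s3:GetObject`. A minimal sketch, assuming the bucket is literally named `monkeypox` (the real name may differ) and that the account's Block Public Access settings permit public policies:

```python
import json


def public_read_policy(bucket: str) -> str:
    """Return a bucket policy (JSON) allowing anonymous reads of every object."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # Covers all keys in the bucket, cached sources included.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # "monkeypox" is an assumed bucket name, used for illustration only.
    print(public_read_policy("monkeypox"))
```

The resulting document could be applied with the AWS console or CLI. Because the `Resource` pattern matches every key, anything stored in the bucket becomes readable.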
Additional context
This would also make the cached sources public, and those can contain PII. We already provide the source URLs in the bucket, so a case can be made that we are archiving for transparency reasons, but we would run the risk of hosting PII ourselves. Thoughts @Mougk?
Alternative solution
Move source fetching to a separate script that writes to a private monkeypox-sources bucket, remove the sources from the monkeypox bucket, and then make the monkeypox bucket public.
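The migration step could start from a plan that separates source objects from everything else. A rough sketch, assuming cached sources live under a `sources/` key prefix (hypothetical; the actual key layout is not described here) and using made-up example keys:

```python
from typing import Dict, Iterable, List

# Hypothetical prefix; adjust to wherever cached sources are actually stored.
SOURCES_PREFIX = "sources/"


def plan_split(
    keys: Iterable[str], sources_prefix: str = SOURCES_PREFIX
) -> Dict[str, List[str]]:
    """Partition object keys into those to move to the private
    monkeypox-sources bucket and those that can stay in the public bucket."""
    plan: Dict[str, List[str]] = {"move_to_private": [], "keep_public": []}
    for key in keys:
        target = "move_to_private" if key.startswith(sources_prefix) else "keep_public"
        plan[target].append(key)
    return plan


if __name__ == "__main__":
    # Example keys are illustrative only, not real bucket contents.
    example_keys = ["sources/page-1.html", "archives/snapshot-1.csv"]
    print(plan_split(example_keys))
```

Only after every key in `move_to_private` has been copied to the private bucket and deleted from the original should the monkeypox bucket be made public.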