EESSI / filesystem-layer

Filesystem layer of the EESSI project
https://eessi.github.io/docs/filesystem_layer
GNU General Public License v2.0

cleanup of tarball bucket to fix automated ingestion #185

Open bedroge opened 4 months ago

bedroge commented 4 months ago

We recently seem to have hit a limit with the bucket: the automated ingestion only picks up the first 500 tarballs it finds in the S3 bucket. Because tarballs are picked up in alphabetical order and skylake sorts near the bottom of the list, new skylake tarballs were not being found. We can probably fix this properly with some form of pagination, and we were considering reimplementing this workflow anyway. For now I've applied a quick workaround: I set up a new bucket (software.eessi.io-archive), synced all files from the existing bucket to the archive bucket, and removed them from the existing one.

We can repeat this cleanup whenever needed by running the following:

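# Credentials with access to both buckets
export AWS_SECRET_ACCESS_KEY=XXX AWS_ACCESS_KEY_ID=YYY
# Copy everything to the archive bucket, then empty the original bucket
aws s3 sync s3://software.eessi.io-2023.06 s3://software.eessi.io-archive
aws s3 rm s3://software.eessi.io-2023.06 --recursive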

Only run this when there are no open PRs in the staging repo and no tarballs have just been uploaded to the bucket (i.e. no open PRs with the bot: deploy label); otherwise those tarballs will be removed from the bucket before they have been ingested.
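
To make that precondition harder to forget, the check could be scripted. The following is only a rough sketch, not part of the existing tooling. It assumes the GitHub CLI (`gh`) is installed and authenticated. The function names `count_open_prs` and `guarded_cleanup` are made up for illustration, and the two repositories are passed in as arguments because which repos to check is an assumption on my side.

```shell
#!/usr/bin/env bash
# Sketch of a guarded cleanup. Function names are hypothetical and the
# repositories to check are supplied by the caller.

SOURCE_BUCKET="s3://software.eessi.io-2023.06"
ARCHIVE_BUCKET="s3://software.eessi.io-archive"

# Print the number of open PRs in a repo, optionally restricted to one label.
# `gh pr list` returns at most 30 PRs by default, which is enough to
# distinguish "none" from "some".
count_open_prs() {
    local repo="$1" label="${2:-}"
    if [ -n "$label" ]; then
        gh pr list --repo "$repo" --state open --label "$label" \
            --json number --jq 'length'
    else
        gh pr list --repo "$repo" --state open --json number --jq 'length'
    fi
}

# Archive and empty the source bucket, but only if no PRs are pending.
# $1: staging repo (OWNER/NAME)
# $2: repo whose PRs carry the "bot: deploy" label (OWNER/NAME)
guarded_cleanup() {
    local staging_repo="$1" deploy_repo="$2"
    local staging_prs deploy_prs
    staging_prs=$(count_open_prs "$staging_repo") || return 2
    deploy_prs=$(count_open_prs "$deploy_repo" "bot: deploy") || return 2
    if [ "$staging_prs" -ne 0 ] || [ "$deploy_prs" -ne 0 ]; then
        echo "refusing to clean up: $staging_prs staging PR(s)," \
             "$deploy_prs deploy PR(s) open" >&2
        return 1
    fi
    # Only delete from the source bucket if the sync succeeded.
    aws s3 sync "$SOURCE_BUCKET" "$ARCHIVE_BUCKET" || return 3
    aws s3 rm "$SOURCE_BUCKET" --recursive
}
```

After sourcing this, it would be called as `guarded_cleanup OWNER/STAGING_REPO OWNER/SOFTWARE_REPO` with the real repository names filled in. Note that it does not close the window between the sync and the removal: a tarball uploaded in between would still be deleted without having been archived.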

bedroge commented 2 months ago

As the number of tarballs was getting quite large again, I've just run this cleanup operation again.
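
To see when the next cleanup is due, something like the following could be used. It is a minimal sketch: the helper names `count_objects` and `needs_cleanup` are made up, the default threshold of 500 is simply the limit observed above, and it counts every object in the bucket rather than tarballs only.

```shell
#!/usr/bin/env bash
# Sketch: report whether a bucket is approaching the observed listing limit.

# Print the number of objects in a bucket (e.g. s3://my-bucket).
# `aws s3 ls --recursive` prints one line per object and handles pagination
# itself. The `|| true` keeps a non-zero exit (e.g. for an empty bucket) from
# aborting the pipeline; as a side effect, real errors are counted as 0.
count_objects() {
    local bucket="$1"
    { aws s3 ls "$bucket" --recursive || true; } | wc -l | tr -d ' '
}

# Return 0 when the object count has reached the threshold, 1 otherwise.
# $1: bucket URL, $2: threshold (defaults to 500)
needs_cleanup() {
    local bucket="$1" threshold="${2:-500}"
    local count
    count=$(count_objects "$bucket")
    if [ "$count" -ge "$threshold" ]; then
        echo "$bucket holds $count objects (threshold $threshold): cleanup advisable"
        return 0
    fi
    echo "$bucket holds $count objects (threshold $threshold)"
    return 1
}
```

For example, `needs_cleanup s3://software.eessi.io-2023.06 400` would flag the bucket a bit before the 500 limit is actually reached.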