wellcomecollection / platform

Wellcome Collection Digital Platform
https://developers.wellcomecollection.org/
MIT License

Analyse the contents of the libfs02-wellcome-it-com and libfs04-wellcome-it-com buckets #5733

Closed alexwlchan closed 1 year ago

alexwlchan commented 1 year ago

There are two S3 buckets in the NCW Prod account, libfs02-wellcome-it-com and libfs04-wellcome-it-com, which were created several years ago as backups of on-premise file shares used by Wellcome Collection.

Based on an initial analysis, the last time we uploaded anything to these buckets was in August 2019.
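For anyone repeating this check, here is a minimal sketch of how the most recent upload in a bucket could be found. It assumes boto3 is installed and that you have read access to the buckets. The function names `iter_pages` and `summarise` are hypothetical, and this is not necessarily how the original analysis was run.

```python
from datetime import datetime
from typing import Iterable, Iterator, Optional


def iter_pages(bucket: str) -> Iterator[dict]:
    """Yield list_objects_v2 result pages for a bucket.

    Needs boto3 and AWS credentials with read access to the bucket.
    """
    import boto3  # imported lazily so summarise() can be used without AWS

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    yield from paginator.paginate(Bucket=bucket)


def summarise(pages: Iterable[dict]) -> dict:
    """Count objects, total their size, and find the newest LastModified."""
    count = 0
    total_bytes = 0
    latest: Optional[datetime] = None

    for page in pages:
        # An empty bucket (or an empty page) has no "Contents" key.
        for obj in page.get("Contents", []):
            count += 1
            total_bytes += obj["Size"]
            if latest is None or obj["LastModified"] > latest:
                latest = obj["LastModified"]

    return {"objects": count, "bytes": total_bytes, "last_modified": latest}
```

Calling `summarise(iter_pages("libfs02-wellcome-it-com"))` would then report the object count, total size, and the timestamp of the newest object. Note that `LastModified` reflects when an object was written to S3, not when the original file was last changed on the file share.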

Kate W has asked whether we still need these buckets, or whether their contents have now been mirrored to our AWS setup and these backups can be safely deleted. I've done some quick analysis of the buckets, and I think it's fine to delete them, but I'd like somebody else to double-check my working before I give the all-clear.

alexwlchan commented 1 year ago

What I've learnt so far

If you want to get access to these buckets, you need to raise a ticket with D&T (which will need to go through the CAB process).

I got that access about a week before my final day, so I wasn't able to do any detailed analysis. My guess is that there's nothing here worth keeping and we could delete both buckets.

My analysis

The tl;dr is that these buckets seem to hold a mix of Goobi and Preservica storage.

We've already migrated everything out of Preservica and confirmed we got it into the storage service. We could repeat that process, but it feels like overkill.

alexwlchan commented 1 year ago

I’ve run this past Ashley and the team; we’re happy to delete both of these.

I’m going to leave this ticket open until Kate confirms the buckets are gone, so we don’t lose track of it.

pollecuttn commented 1 year ago

5/10/2023 NP followed up with Kate.

pollecuttn commented 1 year ago

5/10/2023 confirmed deleted by Kate