microbiomedata / issues

public repo for issues related to NMDC work

Back up production MongoDB database and `mongo` container to HPSS #425

Closed (aclum closed this 2 days ago)

aclum commented 9 months ago

Deliverable this task is associated with

See Deliverables tab here: https://docs.google.com/spreadsheets/d/1lLQ3wAwJzvxujER6-b4iDnnHx06MYsDbmIBMnu1IOeI/edit#gid=0

RACI

Tag people in their roles

Describe the task

Criteria for completion

Estimate people time

Completion Date (Goal)

Target Sprint Start & End Dates

Tag Blocker/Contingent upon issues

eecavanna commented 9 months ago

Backing up the database

Regarding this part of the task:

make a permanent backup of the current mongo prod database.

I think @shreddd already does this on a nightly basis. I will confirm with him.

If not, I think I would disable writes (this would affect workflows) and use `mongodump` to dump the database to some safe storage place (e.g. HPSS).
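For illustration, the `mongodump` step could be sketched roughly as below. This is only a sketch under assumptions: the function name, the connection URI, and the archive filename are hypothetical placeholders, not values from our deployment. It only builds the command line; disabling writes beforehand and copying the archive to HPSS afterward are not shown.

```python
import shlex


def build_mongodump_command(uri: str, archive_path: str) -> list[str]:
    """Build the argv for a compressed, single-file mongodump.

    `uri` and `archive_path` are supplied by the caller; the values
    used below are placeholders, not the real production settings.
    """
    return [
        "mongodump",
        f"--uri={uri}",               # connection string of the database to dump
        f"--archive={archive_path}",  # write one archive file instead of a directory tree
        "--gzip",                     # compress the archive
    ]


# Hypothetical example values (assumptions, not from this thread).
command = build_mongodump_command("mongodb://localhost:27017", "nmdc-prod.archive.gz")
print(shlex.join(command))
```

Building the command as a list (rather than a single string) means it could be passed directly to `subprocess.run(command, check=True)` without shell quoting concerns.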

Backing up the container

Regarding this other part of the task:

Also backup the container that is running in SPIN for mongo

I assume this is referring to backing up the Kubernetes YAML file that describes—and can be used to recreate—the container. That YAML file can be downloaded from Rancher's web UI.

The container itself is an instance of the bitnami/mongodb:6.0.4 Docker image, an off-the-shelf image hosted on Docker Hub.

aclum commented 9 months ago

We need to make sure that we have a copy of mongo that won't get lifecycled into deletion at some point, and to confirm that it contains all of the collections we need. @shreddd was the one who mentioned backing up the container. Based on what you've said, downloading a copy of the YAML file and storing it on HPSS should be sufficient.
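As a rough sketch of the "contains all of the collections we need" check: compare the list of required collection names against the names actually found in the restored backup. The function name and the collection names below are hypothetical examples. In practice the `present` list might come from something like PyMongo's `Database.list_collection_names()` run against a restored copy, which is an assumption on my part rather than an agreed procedure.

```python
from typing import Iterable


def missing_collections(required: Iterable[str], present: Iterable[str]) -> list[str]:
    """Return the required collection names that are absent from the backup, sorted."""
    return sorted(set(required) - set(present))


# Hypothetical names, for illustration only.
required = ["collection_a", "collection_b", "collection_c"]
present_in_backup = ["collection_a", "collection_c", "extra_collection"]

missing = missing_collections(required, present_in_backup)
if missing:
    print("Backup is missing:", ", ".join(missing))
else:
    print("Backup contains every required collection.")
```

Extra collections in the backup are ignored here; only required names that are absent get reported.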

eecavanna commented 9 months ago

Ah, OK. Thanks, @aclum. Now I understand why you referred to it as a "permanent" backup in the issue description.

OK, I'll confirm with him what he had in mind regarding "backing up the container."

aclum commented 9 months ago

We are backlogging this. We will do the mongo backup right after we lock the production database. Target is November for updates to prod.

aclum commented 2 days ago

We used the nightly backups.