Open Richard-Hansen opened 3 years ago
➤ Charles Overbeck commented:
Brainstorm:
Probably run a cron job on deployer or jump server
After restoring prod DB, we need to get the dockstore-bot token in there, so GitHub apps will continue to work. Eventually token(s) for the auth user for smoke tests, when we get that working. Depending on the smoke tests, we may need more than just the Dockstore token (at least the GitHub token as well).
As a convenience for developers, maybe we’d want to restore their tokens daily as well (probably not).
Token censoring logic is currently in funkbot, probably don’t want to have staging go through Slack. Can staging access funkbot directly?
pg_dump has an --exclude-table-data option; should prod do two backups to s3, one with and one without tokens?
Give staging r/o access to prod DB and have it run pg_dump? (don’t like that)
Need to reindex ElasticSearch after DB restore
➤ Charles Overbeck commented:
pg_dump -U postgres --no-owner --exclude-table-data=token
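A minimal sketch of how a cron script might assemble the pg_dump invocation above. The flags are the ones from the command; the database name, the output file, and the function name `censored_dump_command` are placeholders for illustration, not values taken from the Dockstore deployment.

```python
import shlex


def censored_dump_command(dbname, outfile, user="postgres", censored_tables=("token",)):
    """Build a pg_dump argv that keeps each censored table's schema but omits its rows."""
    cmd = ["pg_dump", "-U", user, "--no-owner"]
    for table in censored_tables:
        # --exclude-table-data dumps the table definition without its contents
        cmd.append(f"--exclude-table-data={table}")
    # -f names the output file; the database name is the final positional argument
    cmd += ["-f", outfile, dbname]
    return cmd


if __name__ == "__main__":
    # "example_db" and "censored.sql" are hypothetical names
    print(shlex.join(censored_dump_command("example_db", "censored.sql")))
```

Returning an argument list rather than a shell string means it can be handed to `subprocess.run` without shell quoting concerns.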
As a convenience for developers, maybe we’d want to restore their tokens daily as well (probably not).
probably not (reduce attack surface, keep less sensitive info around)
pg_dump has an --exclude-table-data option; should prod do two backups to s3, one with and one without tokens?
could work
Nearly doubles our storage costs (which are pretty small).
would not have to if we set an aggressive retention period to clean out this new bucket (say after a week) since the other copy would be the real backup
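The retention idea above could be expressed as an S3 lifecycle rule. The sketch below only builds the rule document; the rule ID and the 7-day default are assumptions drawn from the "say after a week" suggestion, and nothing here is taken from the actual bucket configuration. The dict follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`.

```python
import json


def expiry_lifecycle(days=7, rule_id="expire-censored-dumps"):
    """Return a lifecycle configuration that expires every object after `days` days."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": rule_id,              # hypothetical rule name
                "Filter": {"Prefix": ""},   # empty prefix applies the rule to the whole bucket
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(expiry_lifecycle(), indent=2))
```

Because the rule covers the whole bucket, it only makes sense on the separate censored bucket, not on the bucket that holds the real backups.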
➤ Charles Overbeck commented:
{quote}> Nearly doubles our storage costs (which are pretty small).
would not have to if we set an aggressive retention period to clean out this new bucket (say after a week) since the other copy would be the real backup{quote}
Another idea is that we could write the censored data to the same S3 object every time.
➤ Charles Overbeck commented:
Will need to run migrations as well.
To support auth smoke tests (https://github.com/dockstore/dockstore-documentation/pull/173/commits/a45ef2a27cb8391f6a17804230d7640983e474e5), the tokens of some (or all) users should be restored whenever the Staging/Dev DB is updated.
Technically, for the auth tests to function, only the token for DockstoreTestUser4 needs to be restored, but it may be easier to restore all tokens simultaneously.
This PR supplies helper SSM documents: https://github.com/dockstore/dockstore-deploy/pull/364
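For orientation, the pieces discussed in this thread (restore the DB, run migrations, restore tokens, reindex Elasticsearch) can be sketched as an ordered run that stops at the first failure. This is not how the SSM documents in the PR are implemented. The step names, the relative order of migrations and token restore, and the `run_refresh` helper are all assumptions for illustration.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_refresh(steps: List[Step]) -> List[str]:
    """Run the steps in order and return the names of the steps that completed.

    An exception from any step propagates, so later steps never run
    against a half-refreshed database.
    """
    completed: List[str] = []
    for name, action in steps:
        action()
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder actions; a real run would call out to the deployment tooling.
    plan: List[Step] = [
        ("restore censored prod dump", lambda: None),
        ("run migrations", lambda: None),
        ("restore tokens", lambda: None),
        ("reindex Elasticsearch", lambda: None),
    ]
    for done in run_refresh(plan):
        print("done:", done)
```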
➤ Charles Overbeck commented:
To verify, instructions from PR:
Needs this fix: https://ucsc-cgl.atlassian.net/browse/SEAB-4044, which is only in dev.
➤ Steve Von Worley commented:
To verify, I contacted a Dockstore admin to ensure that I had permissions to run the SSM documents as described below. He thought that I should be authorized to run them. I attempted to run the first document, and the AWS console displayed the following error: "User: arn:aws:iam::312767926603:user/svonworl is not authorized to perform: ssm:SendCommand". I do not believe that I have permissions to verify this ticket.
Denis mentioned that if I did not have the appropriate permissions, I should comment on this ticket: https://ucsc-cgl.atlassian.net/browse/SEAB-4233. I will do so.
➤ Denis Yuen commented:
Ran into issues as an admin
The output is not … awesome
➤ Steve Von Worley commented:
I took this out of my review ticket, since I sense there’s the possibility it’ll be a while before it’s reviewable.
➤ Denis Yuen commented:
For the first document, was able to look into /var/log/amazon/ssm, found that the issue is
{quote}2022-05-02 15:20:13 INFO [ssm-agent-worker] [MessagingDeliveryService] Sending reply { "additionalInfo": { "agent": { "lang": "en-US", "name": "amazon-ssm-agent", "os": "", "osver": "1", "ver": "" }, "dateTime": "2022-05-02T15:20:13.506Z", "runId": "", "runtimeStatusCounts": null }, "documentStatus": "Failed", "documentTraceOutput": "Input contains invalid parameters [/DeploymentConfig/CensoredBucket]", "runtimeStatus": null }{quote}
Confirmed that the parameter doesn’t exist, not sure if I should manually override
Not able to locate the debug output for the second step, but maybe it is related to the first.
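When digging through /var/log/amazon/ssm for failures like the one quoted above, a small helper can pull the status and trace out of a "Sending reply" line. This is a sketch that assumes the reply is a single JSON object on the same line as the marker, as in the quoted log; the function name `parse_ssm_reply` is made up for illustration.

```python
import json


def parse_ssm_reply(log_line):
    """Extract documentStatus and documentTraceOutput from a 'Sending reply' log line.

    Returns None when the line carries no reply payload.
    """
    marker = "Sending reply"
    idx = log_line.find(marker)
    if idx == -1:
        return None
    start = log_line.find("{", idx)
    if start == -1:
        return None
    # raw_decode parses the first JSON value and ignores any trailing text
    reply, _ = json.JSONDecoder().raw_decode(log_line[start:])
    return {
        "status": reply.get("documentStatus"),
        "trace": reply.get("documentTraceOutput"),
    }
```

Applied to the line quoted above, the trace field is what names the missing `/DeploymentConfig/CensoredBucket` parameter.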
Is your feature request related to a problem? Please describe. Update the Staging stack's database with the most recent version of the Production database at a specific frequency (probably daily/nightly). This will give better testing coverage of database backups.
┆Issue is synchronized with this Jira Story ┆fixVersions: Dockstore 2.X ┆friendlyId: SEAB-3409 ┆sprint: Sprint 72-Yellowspotted Catsha ┆taskType: Story