freezingsaddles / freezing-compose

The Docker Compose structure for running the application suite.
4 stars 2 forks source link

Recover 2023 database #29

Closed obscurerichard closed 10 months ago

obscurerichard commented 10 months ago

Due to a series of unfortunate events we lost the 2023 backup of the Freezing Saddles database.

@obscurerichard did restore a snapshot of the Aurora server and it is waiting for a dump of the SQL database to make it more convenient to work on the 2023 data.

obscurerichard commented 10 months ago

Comments from Slack are here:

merlin 1 month ago Database backup is freezing-2023-11-21.sql, year cleanup is year-start-2024.log, 2024 should be mostly g2g.

74 replies

obscurerichard 1 month ago Excellent. I didn’t see that you had fully cleared the database so I did it again, munis the users. See I see you have already commented on https://github.com/freezingsaddles/freezing-web/pull/233 My change should have been harmless, I re-added the 2024 team. Thanks for taking care of this!

#233 Do not clean out athletes for 2024 We announced registration before we had cleaned out the athletes. So let's avoid clearing the athletes table. Benefits • people may not have to register again on the site using Strava Drawbacks • The database will swell with the ranks of folks who had registered
last year but are not playing this year and their rides • These users will ultimately be shown in the Frends of BAFS leaderboard
for 2024 (and beyond if we keep not resetting this) Comments 1 https://github.com/[freezingsaddles/freezing-web](https://github.com/freezingsaddles/freezing-web)|freezingsaddles/freezing-webfreezingsaddles/freezing-web | Nov 21st | Added by GitHub

merlin 1 month ago the newly-registered users are all gone now

obscurerichard 1 month ago No, they are not, mysql> select count() from athletes; +----------+ | count() | +----------+ | 33 | +----------+ 1 row in set (0.00 sec)

merlin 1 month ago ah, their team association must be gone

merlin 1 month ago Screenshot 2023-11-21 at 2.45.54 PM.png

Screenshot 2023-11-21 at 2.45.54 PM.png

merlin 1 month ago formerly said 33

obscurerichard 1 month ago I did whack the teams table

obscurerichard 1 month ago But I backed things up (again)

obscurerichard 1 month ago So we could restore just that table if we wanted now.

obscurerichard 1 month ago Do you think we should?

merlin 1 month ago i think i can just update the team id

merlin 1 month ago i think i can just update the team id

obscurerichard 1 month ago [centos@ip-172-31-86-46 ~]$ mysql-non-interactive < /tmp/x mysql: [Warning] Using a password on the command line interface can be insecure. count() 0 count() 33 count() 1 count() 0 count(*) 0 cleaning out rides cleaning out rides cleaning out teams cleaning out teams cleaning out ride_geo cleaning out ride_geo cleaning out ride_weather cleaning out ride_weather

obscurerichard 1 month ago That’s my run log from when I cleared things.

merlin 1 month ago mysql> update athletes set team_id = 1190487; Query OK, 33 rows affected (0.01 sec) Rows matched: 33 Changed: 33 Warnings: 0

merlin 1 month ago apparently not enough..

obscurerichard 1 month ago I think I overwrote your database backup from today before the clean-out though, ugh.

obscurerichard 1 month ago I could just restore the whole db…

merlin 1 month ago i don’t remember the data model well enough to know why adding the team_id isn’t enough

merlin 1 month ago we could call it a planned DR exercise

obscurerichard 1 month ago ugh

obscurerichard 1 month ago I really don’t like having lost the 2023 db either

merlin 1 month ago no, that is undesirable

obscurerichard 1 month ago and… there are no RDS snapshots.

obscurerichard 1 month ago hmm. I’m doing the restore from the last backup I made. I had to edit the file a bit to get it to work.

obscurerichard 1 month ago Did you happen to save a copy of the 2023 database locally?

merlin 1 month ago Unfortunately I didn’t. I have a CSV of the athlete points, that’s it.

obscurerichard 1 month ago There’s superuser-only constructions in the dump that have to be manually cleaned out :disappointed:

merlin 1 month ago so..

merlin 1 month ago let’s not do this

merlin 1 month ago a new person has registered

obscurerichard 1 month ago It’s a bit late

merlin 1 month ago oh

merlin 1 month ago i’m confused by the things. too much is forgotten.

obscurerichard 1 month ago The last athlete was: | 127711973 | Strava Athlete | NULL | 3277b9ee6e17dc137bf132f8138202bc388c1a12 | Strava Athlete | https://d3nn82uaxijpm6.cloudfront.net/assets/avatar/athlete/large-800a7033cc92b2a5548399e26b1ef42414dd1a9cb13b99454222d38d58fd28ef.png | 1700609949 | 4e2fe34861a68e91a14024d44a12b8c78116e599 |

(508 B) https://d3nn82uaxijpm6.cloudfront.net/assets/avatar/athlete/large-800a7033cc92b2a5548399e26b1ef42414dd1a9cb13b99454222d38d58fd28ef.png

obscurerichard 1 month ago maybe not :disappointed:

merlin 1 month ago need to pull up code to understand this. Screenshot 2023-11-21 at 3.09.01 PM.png

Screenshot 2023-11-21 at 3.09.01 PM.png

merlin 1 month ago lbd_athletes

obscurerichard 1 month ago Sorry! We could probably figure out who just registered.

merlin 1 month ago the views are stale. i don’t know mysql well enough to know if that’s a thing

merlin 1 month ago create view lbd_athletes as select a.id, a.name, a.display_name, a.team_id from athletes a join teams T on T.id=a.team_id where not T.leaderboard_exclude

merlin 1 month ago mysql> select * from lbd_athletes; +----+------+--------------+---------+ | id | name | display_name | team_id | +----+------+--------------+---------+ | 1 | 1 | 1 | 1 | +----+------+--------------+---------+ 1 row in set (0.00 sec)

merlin 1 month ago mysql> select a.id, a.name, a.display_name, a.team_id from athletes a join teams T on T.id=a.team_id where not T.leaderboard_exclude; +-----------+------------------------------------------+------------------------------------------+---------+ | id | name | display_name | team_id | +-----------+------------------------------------------+------------------------------------------+---------+ | 31138 | James Hekel | James Hekel | 1190487 | | 175612 | Paul Smith | Paul Smith | 1190487 | | 300685 | Chris Randall | Chris Randall | 1190487 | | 649587 | Chris Readinger | Chris Readinger | 1190487 |

merlin 1 month ago

mysql> alter view lbd_athletes as select a.id, a.name, a.display_name, a.team_id from athletes a join teams T on T.id=a.team_id where not T.leaderboard_exclude; Query OK, 0 rows affected (0.02 sec)

merlin 1 month ago so that view is back in action, no idea what it was doing before.

obscurerichard 1 month ago ugh

obscurerichard 1 month ago Yeah I think the views are screwed up

merlin 1 month ago i guess the restore makes views sad. perhaps they will self-correct on inserts

obscurerichard 1 month ago I am not sure. I think maybe there’s something with the views in the dump that they are not getting recreated properly. I’m going to try to correct it with an edited extract from the dump.

obscurerichard 1 month ago Shit. The views are all messed up in the latest dump.

merlin 1 month ago looks pretty simple to copy/paste them from the codebase. most are create or replace view.

obscurerichard 1 month ago Yeah, there’s a relatively easy way to do this from a backup extract.

obscurerichard 1 month ago It did save these but with some other noise the Aurora doesn’t like

obscurerichard 1 month ago ok I think I have fixed them all

obscurerichard 1 month ago see backups/freezing-2023-11-21-viewsonly.sql :+1: 1

obscurerichard 1 month ago I extracted and adapted that from the database dump - it had all the statements needed to correct it but covered in funky comments

obscurerichard 1 month ago Oh! So I wasn’t as completely negligent as I thought about backups. I was looking at the wrong AWS account. We do have RDS snapshot backups: 2 files

image.png

image.png

obscurerichard 1 month ago I even have a manual snapshot backup from the end of last season

obscurerichard 1 month ago So in the absolute worst case we have more options for recovery at our disposal.

merlin 1 month ago Nice

obscurerichard 1 month ago I’d forgotten setting this up properly

obscurerichard 1 month ago I’m going to do a restore to a new instance so we can get that .sql format dump from it for the 2023 end-of-season. And to test the restore. I’m not going to point anything on the server to it though.

merlin 1 month ago Sounds good. I presume an RCA will be forthcoming.

obscurerichard 1 month ago sigh

obscurerichard 1 month ago RCA==PEBKAC

obscurerichard 1 month ago I also saved the snapshot from the wee hours of this morning arn:aws:rds:us-east-1:299196842131:cluster-snapshot:unmanaged-pre-2023-season-backup-2023-11-21-08-24

obscurerichard 1 month ago so it will stick around until further notice

obscurerichard 1 month ago I’m going to set this aside for now and come back to it later tonight or this weekend, I think I have stopped the bleeding from the self-inflicted wound and we have backups after all… for everything except for maybe 1 user registration.

merlin 1 month ago I think that registration may have been a phantom. I saw that weird view count of 1 and assumed a new person

merlin 1 month ago But it was probably the 1/1/1/1 row

obscurerichard 1 month ago That is the definition of the view that happens early in the backup file

obscurerichard 1 month ago The views get re-defined right at the end of the mysqldump file with more explicit SQL code. But for some reason those were not working when I ran them through mysql

obscurerichard 1 month ago I think they had some superuser-only stuff that I trimmed out and they didn’t work as-is

obscurerichard 1 month ago I had to rephrase them as CREATE OR REPLACE VIEW statements instead

obscurerichard 1 month ago without all the funky comment metadata

obscurerichard commented 10 months ago

I ended up being able to dump the database from the restored Aurora server, fs-2023-restore.c7hti2ehau6i.us-east-1.rds.amazonaws.com.

I then post-processed it and clarified the docs for loading a dump on a local dev environment in #30 - this is resolved.

The text-format database dumps from the 2023 season are on the server now:

[centos@ip-172-31-86-46 compose]$ ls -l ~/backups/freezing-2023-11-20*
-rw-r--r--. 1 centos centos 333777341 Dec 31 22:20 /home/centos/backups/freezing-2023-11-20-fixed-for-root-load-2.sql
-rw-rw-r--. 1 centos centos 333777395 Dec 31 20:51 /home/centos/backups/freezing-2023-11-20.sql