databacker / mysql-backup

image to enable automated backups of mysql databases in containers

Configuration in a docker-compose.yaml not fully working as expected #347

Open Srungweber opened 1 month ago

Srungweber commented 1 month ago

I have this setup using docker compose:

services:
  db:
    image: mysql:8.4.2
    restart: always
    environment:
      MYSQL_DATABASE: 'xxxxxxxx'
      # Password for root access
      MYSQL_ROOT_PASSWORD: ${DB_PASSWORD}
      MYSQL_ROOT_HOST: "%"
    ports:
      # <Port exposed> : <MySQL Port running inside container>
      - '3306:3306'
    expose:
      # Opens port 3306 on the container
      - '3306'
    # Where our data will be persisted
    volumes:
      - my-db:/var/lib/mysql

  backup:
    image: databack/mysql-backup:master
    restart: always
    command: dump
    volumes:
      - ~/database/backups:/db_backup
    environment:
      DB_DUMP_TARGET: /db_backup
      DB_USER: root
      DB_PASS: ${DB_PASSWORD}
      DB_DUMP_CRON: 0
      DB_SERVER: db
      DB_DUMP_BY_SCHEMA: false
      DB_NAMES:

volumes:
  my-db:

Every full hour a backup is successfully created. However, two settings do not behave as I expected:

  1. "DB_DUMP_BY_SCHEMA: false"
    • Expected: A single .sql file containing all the data.
    • Actual: The backup contains a new sql file for each schema.
  2. "DB_NAMES:" docs
    • Expected: all schemas incl. system schemas are in the backup
    • Actual: Only the user created schemas are being backed up

I also tried this environment definition, without any luck:

environment:
  - DB_DUMP_TARGET=/db_backup
  - DB_USER=root
  - DB_PASS=${DB_PASSWORD}
  - DB_DUMP_CRON=40
  - DB_SERVER=db
  - DB_DUMP_BY_SCHEMA=false
  - DB_NAMES=

Hardware: an M1 Mac running macOS 14.1.

deitch commented 1 month ago

Fairly or not, the docs for DB_DUMP_BY_SCHEMA are leftovers from the old script-based version. The current version always dumps by schema into separate files, then tars them together (and gzips them; compression is configurable).

If you provide specific db names to back up, it dumps those; otherwise it finds all of them here, which does the following (sketched below):

  1. show databases
  2. exclude any that match one of "information_schema", "performance_schema", "sys", "mysql"
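
For illustration, a minimal Go sketch of that lookup, assuming a plain database/sql connection; the identifiers (listDatabases, systemSchemas) are made up for this example and are not the actual names used in mysql-backup:

package dump

import (
    "database/sql"

    _ "github.com/go-sql-driver/mysql" // MySQL driver registration for database/sql
)

// systemSchemas mirrors the exclusion list described above.
var systemSchemas = map[string]bool{
    "information_schema": true,
    "performance_schema": true,
    "sys":                true,
    "mysql":              true,
}

// listDatabases runs SHOW DATABASES and drops the system schemas,
// approximating the default behaviour when DB_NAMES is left empty.
func listDatabases(db *sql.DB) ([]string, error) {
    rows, err := db.Query("SHOW DATABASES")
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    var names []string
    for rows.Next() {
        var name string
        if err := rows.Scan(&name); err != nil {
            return nil, err
        }
        if systemSchemas[name] {
            continue
        }
        names = append(names, name)
    }
    return names, rows.Err()
}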

The original version of mysql-backup always had a single dump file, then people made really good arguments for having separate backup files. When v1.x came along, we couldn't think of any good reason to keep (and support, and detect) both formats, so it all became file-per-schema.

If there is a good case to be made for needing a single .sql file, we can look at it. What would not be covered by calling restore on the 5 files extracted from a tgz file vs a single one?

Srungweber commented 1 month ago

Thank you for providing this insight. To be honest, I did not check whether the docs are still valid. As for the single .sql file: we don't have a case to make here. Having a separate file per schema is great for us.

Excluding the system schemas is more of an issue for us, as we don't want to recreate all the user accounts in a recovery scenario. Is this a no-no? It leaves us with the option of explicitly naming those system schemas in the configuration, which in and of itself is okay. But that also means all user schemas would need to be named in the configuration, no? So every time a schema is added, someone has to remember to add it to the backup configuration.

deitch commented 1 month ago

No, it's actually a decent suggestion to have an option to back up the system tables as well. Leave this open; I will try to get to it. If you've got golang skills and want to contribute, that's always welcome.

Historically, if I recall correctly, mysqldump excludes system tables, which is why this does too. I have a vague memory of seeing this in the mysqldump source code.

Srungweber commented 1 month ago

Maybe you don't need to. I could easily create a second job that explicitly names the system schemas I want in a backup, and have the other run without DB_NAMES so it covers all current and future user-created schemas.
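
A rough sketch of that two-job idea, assuming the same image and env vars as above (the service names and the second dump target are illustrative, and whether an explicitly named system schema currently survives the exclusion filter is exactly the open question in this thread):

  backup-user:
    image: databack/mysql-backup:master
    restart: always
    command: dump
    volumes:
      - ~/database/backups:/db_backup
    environment:
      DB_SERVER: db
      DB_USER: root
      DB_PASS: ${DB_PASSWORD}
      DB_DUMP_TARGET: /db_backup
      # DB_NAMES left unset: covers all current and future user-created schemas

  backup-system:
    image: databack/mysql-backup:master
    restart: always
    command: dump
    volumes:
      - ~/database/backups/system:/db_backup
    environment:
      DB_SERVER: db
      DB_USER: root
      DB_PASS: ${DB_PASSWORD}
      DB_DUMP_TARGET: /db_backup
      DB_NAMES: mysql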

deitch commented 1 month ago

Yeah, but that is a painful UX and requires lots of extra work. We will get to it.

deitch commented 1 month ago

The easy part of this - a CLI flag/config/env var, plus controls to ensure that those tables do not get filtered out - is done on a branch. The hard part, deciding what to do with it, is creating some complexity.

Is there any point to dumping the tables in these system databases?

So, what are we actually talking about?

Srungweber commented 1 month ago

From where I stand today, I would have a use case for mysql.user. Maybe I'm missing something, but that would be my initial request. Let me play devil's advocate for a moment, though: user data can also be stored in a .sql file and made available as part of the docker-compose up procedure. That would eliminate the need to include the user table in the backup. The downside: when passwords get rolled, the user .sql file would need to be updated, too.
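
As an aside, a minimal sketch of that idea using the official mysql image's /docker-entrypoint-initdb.d mechanism (the users.sql filename is illustrative, and note these scripts only run when the data directory is initialized for the first time):

  db:
    image: mysql:8.4.2
    volumes:
      - my-db:/var/lib/mysql
      # Any *.sql file here is executed once, when the empty data directory is first initialized
      - ./users.sql:/docker-entrypoint-initdb.d/users.sql:ro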

You see, I'm still a bit on the fence about my own idea. Do you remember what the original argument was to exclude the system schemas?

deitch commented 1 month ago

Do you remember what the original argument was to exclude the system schemas?

Sure, mysqldump didn't do it. 😁

The original version of this was just a wrapper around mysqldump, so it inherited a lot. As we switched to a native version, we wanted to change only those things where it really made sense to do so.

Also, implementing each table type was real work in the native version. Worth it, but work nonetheless.

To be fair, mysqldump doesn't back them up for good reasons, mainly those listed above. If you read their docs, they say that if you want to dump those, you have to ask for them explicitly. IIRC, they also recommend against it (but I'm working off memory).
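
For example, with mysqldump you would have to name the system schema explicitly, something like the following (a plain illustration of "ask for it explicitly", not a recommendation):

  mysqldump -u root -p --databases mysql > mysql-system.sql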

User data can also be stored in a sql file and made available as part of the docker-compose up procedure. This would eliminate the need to include the user table in backup

You could do something like this using post-backup scripts, if you really wanted to. In any case, all database backups are snapshots in time, valid only at the moment the backup was taken.

But I wouldn't want to make any knowledge of that inherent to mysql-backup itself. It is too fraught with danger.

I see a few other paths:

I cannot comment on the risks of trying to restore the mysql tables to another database; I just do not know. Tech tools are like knives: they cut food well, but they have edges.

deitch commented 1 month ago

mysql> show full tables;
+------------------------------------------------------+------------+
| Tables_in_mysql                                      | Table_type |
+------------------------------------------------------+------------+
| columns_priv                                         | BASE TABLE |
| component                                            | BASE TABLE |
| db                                                   | BASE TABLE |
| default_roles                                        | BASE TABLE |
| engine_cost                                          | BASE TABLE |
| func                                                 | BASE TABLE |
| general_log                                          | BASE TABLE |
| global_grants                                        | BASE TABLE |
| gtid_executed                                        | BASE TABLE |
| help_category                                        | BASE TABLE |
| help_keyword                                         | BASE TABLE |
| help_relation                                        | BASE TABLE |
| help_topic                                           | BASE TABLE |
| innodb_index_stats                                   | BASE TABLE |
| innodb_table_stats                                   | BASE TABLE |
| ndb_binlog_index                                     | BASE TABLE |
| password_history                                     | BASE TABLE |
| plugin                                               | BASE TABLE |
| procs_priv                                           | BASE TABLE |
| proxies_priv                                         | BASE TABLE |
| replication_asynchronous_connection_failover         | BASE TABLE |
| replication_asynchronous_connection_failover_managed | BASE TABLE |
| replication_group_configuration_version              | BASE TABLE |
| replication_group_member_actions                     | BASE TABLE |
| role_edges                                           | BASE TABLE |
| server_cost                                          | BASE TABLE |
| servers                                              | BASE TABLE |
| slave_master_info                                    | BASE TABLE |
| slave_relay_log_info                                 | BASE TABLE |
| slave_worker_info                                    | BASE TABLE |
| slow_log                                             | BASE TABLE |
| tables_priv                                          | BASE TABLE |
| time_zone                                            | BASE TABLE |
| time_zone_leap_second                                | BASE TABLE |
| time_zone_name                                       | BASE TABLE |
| time_zone_transition                                 | BASE TABLE |
| time_zone_transition_type                            | BASE TABLE |
| user                                                 | BASE TABLE |
+------------------------------------------------------+------------+
38 rows in set (0.00 sec)
deitch commented 1 month ago

At least they all are base tables.