databacker / mysql-backup

image to enable automated backups of mysql databases in containers
636 stars 178 forks source link

MariaDB Compatability issue #304

Closed peerau closed 4 months ago

peerau commented 4 months ago

https://github.com/databacker/mysql-backup/blob/master/pkg/database/mysql/dump.go#L92

MariaDB has no collat for utf8mb4_0900_*, instead using uca1400_* (basically a newer version of the same collation)

deitch commented 4 months ago

The line there is:

CREATE DATABASE /*!32312 IF NOT EXISTS*/ ` + "`{{.Database}}`" + ` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci */ /*!80016 DEFAULT ENCRYPTION='N' */;

USE ` + "`{{.Database}}`;"

Does this affect utf8mb4_0900_ai_ci for COLLATE only? Or utf8mb4 for DEFAULT CHARACTER SET as well?

deitch commented 4 months ago

I assume you tested it. Did you end up with a successful dump, but then could not restore the file? And if that is the case, and you manually edited the file, did it then work? Which version(s) of MariaDB?

peerau commented 4 months ago

I changed the COLLATE to uca1400_, yes the default charset is fine - dumping and restoring from MariaDB 10.17

deitch commented 4 months ago

The latest is 11.x (11.3.2). Can you confirm that it works there as well?

deitch commented 4 months ago

I think eventually we may need a separate standalone binary for mariadb. I would like to avoid that for as long as possible. If we can tell - based on querying the database - if it is mariadb or mysql, we can determine which template to use.

Even better would be if there is somewhere to determine what the collation is, i.e. capabilities rather than among.

@peerau do you know how we can accomplish those reliably?

peerau commented 4 months ago

SELECT @@character_set_database, @@collation_database; seems to work across both mysql and mariadb a quick parity check with a docker setup gives me:

/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mysql-80 mysql
utf8mb4 utf8mb4_0900_ai_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mysql-80 mysql
utf8mb4 utf8mb4_0900_ai_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mysql-83 mysql
utf8mb4 utf8mb4_0900_ai_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mysql-5  mysql
utf8mb4 utf8mb4_0900_ai_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mariadb-lts mysql
utf8mb4 utf8mb4_general_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mariadb-11 mysql
utf8mb4 utf8mb4_general_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mariadb-10 mysql
latin1  latin1_swedish_ci
peerau commented 4 months ago

utf8mb4_general_a?_c? might be a sane default?

deitch commented 4 months ago

I have an initial cut to query the database and get charset and collation, rather than hard-coding.

I did a quick search, it also appears we set charset in a few other places.

In view:

SET @saved_cs_client     = @@character_set_client;
/*!50503 SET character_set_client = utf8mb4 */;

In table:

/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;

Are those correct? Or should those also be taken from a query?

peerau commented 4 months ago

Those should be set by query, but utf8mb4 is an extremely sane default

On Tue, 30 Apr 2024, 10:22 pm Avi Deitcher, @.***> wrote:

I have an initial cut to query the database and get charset and collation, rather than hard-coding.

I did a quick search, it also appears we set charset in a few other places.

In view:

SET @saved_cs_client = @@character_set_client;/!50503 SET character_set_client = utf8mb4 /;

In table:

/!40101 SET @saved_cs_client = @@character_set_client /;/!50503 SET character_set_client = utf8mb4 /;

Are those correct? Or should those also be taken from a query?

— Reply to this email directly, view it on GitHub https://github.com/databacker/mysql-backup/issues/304#issuecomment-2085248834, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAG6CYVSLS7OIZSDO7XOAITY76HZXAVCNFSM6AAAAABG44WEUSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAOBVGI2DQOBTGQ . You are receiving this because you were mentioned.Message ID: @.***>

deitch commented 4 months ago

Which query? For views, we get them from SELECT TABLE_NAME,CHARACTER_SET_CLIENT,COLLATION_CONNECTION FROM INFORMATION_SCHEMA.VIEWS. There is a parallel for tables, is that the right place?

deitch commented 4 months ago

I did a similar query, but I am not sure it gives what I expect:

mysql> SELECT TABLE_NAME,TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES where table_name = 't1';
+------------+-------------------+
| TABLE_NAME | TABLE_COLLATION   |
+------------+-------------------+
| t1         | latin1_swedish_ci |
+------------+-------------------+
1 row in set (0.00 sec)

mysql> show create table t1;
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                               |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `id` int(11) DEFAULT NULL,
  `name` varchar(20) DEFAULT NULL,
  `d` date DEFAULT NULL,
  `t` time DEFAULT NULL,
  `dt` datetime DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

On the other hand, if I do a mysqldump for that database (just the table part):

--
-- Table structure for table `t1`
--

DROP TABLE IF EXISTS `t1`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `t1` (
  `id` int(11) DEFAULT NULL,
  `name` varchar(20) DEFAULT NULL,
  `d` date DEFAULT NULL,
  `t` time DEFAULT NULL,
  `dt` datetime DEFAULT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `t1`
--

LOCK TABLES `t1` WRITE;
/*!40000 ALTER TABLE `t1` DISABLE KEYS */;
INSERT INTO `t1` VALUES (1,'John','2012-11-01','00:15:00','2012-11-01 00:15:00','2012-11-01 00:15:00'),(2,'Jill','2012-11-02','00:16:00','2012-11-02 00:16:00','2012-11-02 00:16:00'),(3,'Sam','2012-11-03','00:17:00','2012-11-03 00:17:00','2012-11-03 00:17:00'),(4,'Sarah','2012-11-04','00:18:00','2012-11-04 00:18:00','2012-11-04 00:18:00');
/*!40000 ALTER TABLE `t1` ENABLE KEYS */;
UNLOCK TABLES;

So we get the CHARSET as part of the SHOW CREATE TABLE , which is good. The line just before, though:

/*!50503 SET character_set_client = utf8mb4 */;

Where does it come from?

deitch commented 4 months ago

I merged in the fix for this specific issue. But if we can get the remaining utf* stuff via queries, by all means. Please help direct.