Closed peerau closed 4 months ago
The line there is:
CREATE DATABASE /*!32312 IF NOT EXISTS*/ ` + "`{{.Database}}`" + ` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci */ /*!80016 DEFAULT ENCRYPTION='N' */;
USE ` + "`{{.Database}}`;"
Does this affect utf8mb4_0900_ai_ci
for COLLATE only? Or utf8mb4
for DEFAULT CHARACTER SET
as well?
I assume you tested it. Did you end up with a successful dump, but then could not restore the file? And if that is the case, and you manually edited the file, did it then work? Which version(s) of MariaDB?
I changed the COLLATE to uca1400_, yes the default charset is fine - dumping and restoring from MariaDB 10.17
The latest is 11.x (11.3.2). Can you confirm that it works there as well?
I think eventually we may need a separate standalone binary for mariadb. I would like to avoid that for as long as possible. If we can tell - based on querying the database - if it is mariadb or mysql, we can determine which template to use.
Even better would be if there is somewhere to determine what the collation is, i.e. capabilities rather than among.
@peerau do you know how we can accomplish those reliably?
SELECT @@character_set_database, @@collation_database;
seems to work across both mysql and mariadb
a quick parity check with a docker setup gives me:
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mysql-80 mysql
utf8mb4 utf8mb4_0900_ai_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mysql-80 mysql
utf8mb4 utf8mb4_0900_ai_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mysql-83 mysql
utf8mb4 utf8mb4_0900_ai_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mysql-5 mysql
utf8mb4 utf8mb4_0900_ai_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mariadb-lts mysql
utf8mb4 utf8mb4_general_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mariadb-11 mysql
utf8mb4 utf8mb4_general_ci
/ # echo 'SELECT @@character_set_database, @@collation_database;\q' | mysql --user=root --password=password --host mariadb-10 mysql
latin1 latin1_swedish_ci
utf8mb4_general_a?_c? might be a sane default?
I have an initial cut to query the database and get charset and collation, rather than hard-coding.
I did a quick search, it also appears we set charset in a few other places.
In view:
SET @saved_cs_client = @@character_set_client;
/*!50503 SET character_set_client = utf8mb4 */;
In table:
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
Are those correct? Or should those also be taken from a query?
Those should be set by query, but utf8mb4 is an extremely sane default
On Tue, 30 Apr 2024, 10:22 pm Avi Deitcher, @.***> wrote:
I have an initial cut to query the database and get charset and collation, rather than hard-coding.
I did a quick search, it also appears we set charset in a few other places.
In view:
SET @saved_cs_client = @@character_set_client;/!50503 SET character_set_client = utf8mb4 /;
In table:
/!40101 SET @saved_cs_client = @@character_set_client /;/!50503 SET character_set_client = utf8mb4 /;
Are those correct? Or should those also be taken from a query?
— Reply to this email directly, view it on GitHub https://github.com/databacker/mysql-backup/issues/304#issuecomment-2085248834, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAG6CYVSLS7OIZSDO7XOAITY76HZXAVCNFSM6AAAAABG44WEUSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAOBVGI2DQOBTGQ . You are receiving this because you were mentioned.Message ID: @.***>
Which query? For views, we get them from SELECT TABLE_NAME,CHARACTER_SET_CLIENT,COLLATION_CONNECTION FROM INFORMATION_SCHEMA.VIEWS
. There is a parallel for tables, is that the right place?
I did a similar query, but I am not sure it gives what I expect:
mysql> SELECT TABLE_NAME,TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES where table_name = 't1';
+------------+-------------------+
| TABLE_NAME | TABLE_COLLATION |
+------------+-------------------+
| t1 | latin1_swedish_ci |
+------------+-------------------+
1 row in set (0.00 sec)
mysql> show create table t1;
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1 | CREATE TABLE `t1` (
`id` int(11) DEFAULT NULL,
`name` varchar(20) DEFAULT NULL,
`d` date DEFAULT NULL,
`t` time DEFAULT NULL,
`dt` datetime DEFAULT NULL,
`ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
On the other hand, if I do a mysqldump
for that database (just the table part):
--
-- Table structure for table `t1`
--
DROP TABLE IF EXISTS `t1`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `t1` (
`id` int(11) DEFAULT NULL,
`name` varchar(20) DEFAULT NULL,
`d` date DEFAULT NULL,
`t` time DEFAULT NULL,
`dt` datetime DEFAULT NULL,
`ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
--
-- Dumping data for table `t1`
--
LOCK TABLES `t1` WRITE;
/*!40000 ALTER TABLE `t1` DISABLE KEYS */;
INSERT INTO `t1` VALUES (1,'John','2012-11-01','00:15:00','2012-11-01 00:15:00','2012-11-01 00:15:00'),(2,'Jill','2012-11-02','00:16:00','2012-11-02 00:16:00','2012-11-02 00:16:00'),(3,'Sam','2012-11-03','00:17:00','2012-11-03 00:17:00','2012-11-03 00:17:00'),(4,'Sarah','2012-11-04','00:18:00','2012-11-04 00:18:00','2012-11-04 00:18:00');
/*!40000 ALTER TABLE `t1` ENABLE KEYS */;
UNLOCK TABLES;
So we get the CHARSET as part of the SHOW CREATE TABLE
, which is good. The line just before, though:
/*!50503 SET character_set_client = utf8mb4 */;
Where does it come from?
I merged in the fix for this specific issue. But if we can get the remaining utf*
stuff via queries, by all means. Please help direct.
https://github.com/databacker/mysql-backup/blob/master/pkg/database/mysql/dump.go#L92
MariaDB has no collat for utf8mb4_0900_*, instead using uca1400_* (basically a newer version of the same collation)