yobasystems / alpine-mariadb

MariaDB running on Alpine Linux [Docker]
https://hub.docker.com/r/yobasystems/alpine-mariadb/
237 stars 71 forks source link

No ability to auto-create database with a different charset/collation #30

Closed RussellNS closed 5 years ago

RussellNS commented 5 years ago

First off, I love the container. Thank you.

Also, I'm relatively new to 'Issues' vs 'Enhancement Requests.' So If I'm doing this wrong, please let me know. This is an issue, but also an enhancement request.

My issue is that my use case requires an initial database to be auto-created that's 'utf8mb4' for the character set and the default 'utf8mb4_general_ci' for collation. When I run a container using the example provided:

--character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci

This creates a 'mysql' database that meets the requirements, but the additional auto-created database is still 'utf8' and 'utf8_general_ci'.

MariaDB [(none)]> SELECT SCHEMA_NAME 'database', default_character_set_name 'charset', DEFAULT_COLLATION_NAME 'collation' FROM information_schema.SCHEMATA;
+--------------------+---------+--------------------+
| database           | charset | collation          |
+--------------------+---------+--------------------+
| information_schema | utf8    | utf8_general_ci    |
| mysql              | utf8mb4 | utf8mb4_unicode_ci |
| performance_schema | utf8    | utf8_general_ci    |
| my_database        | utf8    | utf8_general_ci    |
+--------------------+---------+--------------------+
4 rows in set (0.001 sec)

The code that auto-creates the additional database appears to be in 'run.sh':

    if [ "$MYSQL_DATABASE" != "" ]; then
        echo "[i] Creating database: $MYSQL_DATABASE"
        echo "CREATE DATABASE IF NOT EXISTS \`$MYSQL_DATABASE\` CHARACTER SET utf8 COLLATE utf8_general_ci;" >> $tfile

        if [ "$MYSQL_USER" != "" ]; then
        echo "[i] Creating user: $MYSQL_USER with password $MYSQL_PASSWORD"
        echo "GRANT ALL ON \`$MYSQL_DATABASE\`.* to '$MYSQL_USER'@'%' IDENTIFIED BY '$MYSQL_PASSWORD';" >> $tfile
        fi
    fi

Would it be possible to change this code to allow the user to provide the character set and collation desired using two environment variables?

Proposed Example 'docker-compose.yml':

    mysql:
      image: yobasystems/alpine-mariadb
      environment:
        MYSQL_ROOT_PASSWORD: hguyFtgfR4r9R4r76
        MYSQL_DATABASE: wordpressdb
        MYSQL_USER: wordpressuser
        MYSQL_PASSWORD: hguyFt6S95dgfR4ryb
        MYSQL_CHARSET: utf8mb4
        MYSQL_COLLATION: utf8mb4_general_ci
      expose:
        - "3306"
      volumes:
        - /data/example/mysql:/var/lib/mysql
      restart: always

Proposed Example 'run.sh' Code Change:

    if [ "$MYSQL_DATABASE" != "" ]; then
        echo "[i] Creating database: $MYSQL_DATABASE"
        if [ "$MYSQL_CHARSET" != "" ] && [ "$MYSQL_COLLATION" != "" ]; then
            echo "[i] with character set [$MYSQL_CHARSET] and collation [$MYSQL_COLLATION]"
            echo "CREATE DATABASE IF NOT EXISTS \`$MYSQL_DATABASE\` CHARACTER SET $MYSQL_CHARSET COLLATE $MYSQL_COLLATION;" >> $tfile
        else
            echo "[i] with character set: 'utf8' and collation: 'utf8_general_ci'"
            echo "CREATE DATABASE IF NOT EXISTS \`$MYSQL_DATABASE\` CHARACTER SET utf8 COLLATE utf8_general_ci;" >> $tfile
        fi

        if [ "$MYSQL_USER" != "" ]; then
        echo "[i] Creating user: $MYSQL_USER with password $MYSQL_PASSWORD"
        echo "GRANT ALL ON \`$MYSQL_DATABASE\`.* to '$MYSQL_USER'@'%' IDENTIFIED BY '$MYSQL_PASSWORD';" >> $tfile
        fi
    fi

This would default to 'utf8' and 'utf8_general_ci' unless a user provided both values, but also allow a user to provide any of these:

MariaDB [(none)]> SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
| armscii8 | ARMSCII-8 Armenian          | armscii8_general_ci |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
| cp866    | DOS Russian                 | cp866_general_ci    |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak  | keybcs2_general_ci  |      1 |
| macce    | Mac Central European        | macce_general_ci    |      1 |
| macroman | Mac West European           | macroman_general_ci |      1 |
| cp852    | DOS Central European        | cp852_general_ci    |      1 |
| latin7   | ISO 8859-13 Baltic          | latin7_general_ci   |      1 |
| utf8mb4  | UTF-8 Unicode               | utf8mb4_general_ci  |      4 |
| cp1251   | Windows Cyrillic            | cp1251_general_ci   |      1 |
| utf16    | UTF-16 Unicode              | utf16_general_ci    |      4 |
| utf16le  | UTF-16LE Unicode            | utf16le_general_ci  |      4 |
| cp1256   | Windows Arabic              | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic              | cp1257_general_ci   |      1 |
| utf32    | UTF-32 Unicode              | utf32_general_ci    |      4 |
| binary   | Binary pseudo charset       | binary              |      1 |
| geostd8  | GEOSTD8 Georgian            | geostd8_general_ci  |      1 |
| cp932    | SJIS for Windows Japanese   | cp932_japanese_ci   |      2 |
| eucjpms  | UJIS for Windows Japanese   | eucjpms_japanese_ci |      3 |
+----------+-----------------------------+---------------------+--------+
40 rows in set (0.000 sec)

Also, for the use case that a user simply flubs up the values they provide this way, a warning could be placed in the README.md that invalid values could cause the container to become unstable or not start at all.

I can make the change myself using the Dockerfile and 'run.sh' you've provided, but I thought I'd ask here first.

Thank you for your attention.

EDIT: Typos

dominictayloruk commented 5 years ago

Fixed in https://github.com/yobasystems/alpine-mariadb/commit/72544f562e2f685ab08eb7b1c01e0977e447e193