colinmollenhour / mariadb-galera-swarm

MariaDb Galera Cluster container based on official mariadb image which can auto-bootstrap and recover cluster state.
https://hub.docker.com/r/colinmollenhour/mariadb-galera-swarm
Apache License 2.0

Use of bootstrap.sql to import existing sql files fails on large database. #45

Closed alphaDev23 closed 5 years ago

alphaDev23 commented 6 years ago

While this is related to the ERROR 1105 issue described in https://github.com/colinmollenhour/mariadb-galera-swarm/issues/44, this issue has a different mechanism: no error is reported, and tables silently fail to appear in the database, with nothing in the log.

The issue is related to the following import code, which start.sh runs on startup:

    for f in /docker-entrypoint-initdb.d/*; do
            case "$f" in
                    *.sh)     echo "$0: running $f"; . "$f" ;;
                    *.sql)    echo "$0: appending $f"; cat "$f" >> /tmp/bootstrap.sql ;;
                    *.sql.gz) echo "$0: appending $f"; gunzip -c "$f" >> /tmp/bootstrap.sql ;;
                    *)        echo "$0: ignoring $f" ;;
            esac
            echo
    done

Commenting out the code above and adding code similar to that in the mariadb container (in https://github.com/docker-library/mariadb/blob/master/docker-entrypoint.sh) after mysqld starts, e.g. after 'gosu mysql mysqld.sh --console...', appears to correct the issue, although I do not know whether this creates other issues:

    for f in /docker-entrypoint-initdb.d/*; do
            case "$f" in
                    *.sh)     echo "$0: running $f"; . "$f" ;;
                    *.sql)    echo "$0: running $f"; "${mysql[@]}" < "$f"; echo ;;
                    *.sql.gz) echo "$0: running $f"; gunzip -c "$f" | "${mysql[@]}"; echo ;;
                    *)        echo "$0: ignoring $f" ;;
            esac
            echo
    done

Is there a reason why a bootstrap file is used in the current version versus importing directly via code similar to mariadb's code?

colinmollenhour commented 6 years ago

It's been quite a while, so I don't remember exactly why, but in general the startup process is quite different because of the multi-node architecture: at the time this code runs, the server is not yet running.

Any idea why that method doesn't work for you?

It might not be that hard to move that code somewhere else, though.
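One way to run the import only once the server is up, as proposed earlier in this thread, is to poll until the server answers and then pipe each file into a client. The following is a minimal sketch, not code from this repository: `wait_for_server` and `import_initdb_files` are hypothetical helper names, and the probe and client commands are passed in as arguments so that nothing is assumed about how start.sh invokes them.

```shell
#!/bin/sh
# Hypothetical helpers; they are not part of start.sh.

# wait_for_server TRIES DELAY PROBE...
# Runs PROBE (for example "mysqladmin ping --silent") until it succeeds,
# at most TRIES times, sleeping DELAY seconds after each failed attempt.
wait_for_server() {
    wfs_tries=$1; wfs_delay=$2; shift 2
    while [ "$wfs_tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        wfs_tries=$((wfs_tries - 1))
        sleep "$wfs_delay"
    done
    return 1
}

# import_initdb_files DIR CLIENT...
# Pipes every *.sql and *.sql.gz file in DIR into CLIENT (for example
# "mysql -uroot"). Returns 1 at the first failed import so that a failure
# is not silent.
import_initdb_files() {
    iif_dir=$1; shift
    for iif_f in "$iif_dir"/*; do
        case "$iif_f" in
            *.sql)    echo "importing $iif_f"; "$@" < "$iif_f" || return 1 ;;
            *.sql.gz) echo "importing $iif_f"; gunzip -c "$iif_f" | "$@" || return 1 ;;
            *)        echo "ignoring $iif_f" ;;
        esac
    done
}
```

A call might look like `wait_for_server 30 1 mysqladmin ping --silent && import_initdb_files /docker-entrypoint-initdb.d mysql -uroot`; the credentials and retry counts here are placeholders. On a multi-node cluster the import would presumably need to run on the bootstrapping node only, which this sketch does not handle.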

alphaDev23 commented 6 years ago

I did not have time to determine why the --init-file method did not work on larger databases, but at this point I assume that it is some idiosyncrasy similar to the one in issue #44 (which was a time sink to debug, in part because a character-escaping issue complicated things further). Hopefully someone on GitHub can provide insight, but it may be best to stay away from this method for importing existing files, given its nonsensical and arbitrary query limitations such as 20,000 bytes.
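If a size limit like the one mentioned here is what cuts the import short, one way to check a dump before passing it to --init-file is to look for oversized lines. This is a rough sketch under two assumptions that this thread does not confirm: that the limit is 20,000 bytes, and that it applies per line of the file. `find_long_lines` is a hypothetical helper.

```shell
#!/bin/sh
# find_long_lines LIMIT FILE
# Prints "LINE_NUMBER: N bytes" for every line of FILE longer than LIMIT bytes.
# LC_ALL=C makes awk count bytes rather than multibyte characters.
find_long_lines() {
    LC_ALL=C awk -v limit="$1" \
        'length($0) > limit { printf "%d: %d bytes\n", NR, length($0) }' "$2"
}
```

For example, `find_long_lines 20000 /tmp/bootstrap.sql` lists the lines that exceed the assumed limit; empty output means no line is longer than that.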

Moving the code to the location mentioned in my post above does seem to work but it has only been tested with one database. If anyone else can provide insight or run tests with different databases to ensure that nothing else breaks, that would help confirm the change.

Are there any tests planned for this code base? If so, initializing a cluster with databases of arbitrary size and content, and then verifying that the cluster matches the original after initialization, would catch any discrepancies.
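A check along these lines could start by comparing table lists, since the symptom reported in this issue is tables silently missing after initialization. This is a minimal sketch rather than a test that exists in this repository: `list_tables` and `missing_tables` are hypothetical helpers, and the client command (for example `mysql -h node1 -uroot`) is an assumption passed in as arguments.

```shell
#!/bin/sh
# list_tables CLIENT...
# Prints one "schema.table" per line for all non-system tables, using CLIENT
# (for example "mysql -uroot") to query information_schema.
list_tables() {
    "$@" -N -B -e "SELECT CONCAT(table_schema, '.', table_name)
                   FROM information_schema.tables
                   WHERE table_schema NOT IN
                         ('mysql', 'information_schema', 'performance_schema', 'sys')
                   ORDER BY 1"
}

# missing_tables EXPECTED_FILE ACTUAL_FILE
# Prints every line of EXPECTED_FILE that does not occur in ACTUAL_FILE.
# Returns 0 when nothing is missing, 1 when tables are missing,
# and 2 when grep itself fails (for example on an unreadable file).
missing_tables() {
    grep -F -x -v -f "$2" "$1"
    case $? in
        0) return 1 ;;  # grep printed at least one missing table
        1) return 0 ;;  # nothing selected: every expected table is present
        *) return 2 ;;
    esac
}
```

In use, `list_tables` would be run once against the original server and once against the initialized cluster, with each result saved to a file and the two files handed to `missing_tables`. Table presence alone does not prove the data matches; comparing row counts or `CHECKSUM TABLE` results per table would be a natural extension.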

colinmollenhour commented 6 years ago

I wonder if you are better off seeding the database through the app server on startup rather than the database server?

I don't have time to help you make modifications right now, but if you submit pull requests I'll look at them and consider them. Thanks!

alphaDev23 commented 6 years ago

I appreciate your feedback. My time is extremely limited, and I will not have time to work on this issue further. If you want to modify the code base with the above suggestions, feel free to. If the current code is not revised, though, please cite the limitations in the README so that other users are not burdened with hours of debugging that has already been done here. This should obviously be left as an issue, but that does not substitute for the README, because users rarely scour issue posts, especially as the number of issues grows.

Also, please note that the current implementation breaks compatibility with the mariadb container. Users coming from that container will expect the same functionality unless alerted to the differences.