sclorg / mysql-container

MySQL container images based on Red Hat Software Collections and intended for OpenShift and general usage. Users can choose between Red Hat Enterprise Linux, Fedora, and CentOS based images.
http://softwarecollections.org
Apache License 2.0
128 stars 200 forks source link

bz1505037: Introduce DEBUG_IGNORE_SCRIPT_FAILURES option to rescue from failure of the setup script #199

Closed nak3 closed 6 years ago

nak3 commented 7 years ago

Since run-mysqld sets set -eu, if commands returned non 0 result inside passwd-change.sh, mysql container does not start. Due to this, if mysql command starts failing inside passwd-change.sh, mysql does not run and we cannot debug it.

This patch introduces set +e inside the script and ignore the user/password creation error, so users can debug mysql after it starts running.

centos-ci commented 7 years ago

Can one of the admins verify this patch?

rhscl-bot commented 7 years ago

Can one of the admins verify this patch?

hhorak commented 7 years ago

Thanks @nak3 for reporting. Do you know about some specific case when some command in the script returns non-zero exit code? I'd rather try fix that case, than ignore the failure by setting set +e, since at this point I don't see what command could return non-exit code and it would be taken as fine.

nak3 commented 7 years ago

Hi @hhorak

I'm sorry, I should have informed the bug ticket. Below is the exact ticket. We have not found the reason why mysql returned Access denied for user 'root'@'localhost' yet. (We are debugging it by an image which we built by ourselves as mysql never become Running.)

Bug 1505037 - mysql pod is not debug friendly when passwd-change.sh failed https://bugzilla.redhat.com/show_bug.cgi?id=1505037

$ oc logs mysql-1-x95v8 -c mysql

  OUTPUT:
  ---> 14:16:58     Processing MySQL configuration files ...

   ... snip ...

  2017-10-18T14:16:59.149617Z 0 [Note] /opt/rh/rh-mysql57/root/usr/libexec/mysqld: ready for connections.
  Version: '5.7.16'  socket: '/tmp/mysql.sock'  port: 0  MySQL Community Server (GPL)
  2017-10-18T14:16:59.535076Z 2 [Note] Access denied for user 'UNKNOWN_USER'@'localhost' (using password: NO)
  ---> 14:16:59     MySQL started successfully
  ---> 14:16:59     Setting passwords ...
  2017-10-18T14:16:59.558903Z 3 [Note] Access denied for user 'root'@'localhost' (using password: NO)
  ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)
nak3 commented 7 years ago

Another option I came up with that:

in 5.x/root/usr/bin/run-mysqld

if [[ ! -v DEBUG_MODE ]]; then
  set -eu
fi
hhorak commented 7 years ago

I understand better now, although I don't think that just turning on the debug mode should change behaviour of the image. That could lead to a situation where the image fails when the debug mode is disabled, but would work fine when debug mode is enabled, which wouldn't help debugging much.

PR #202 tries to help debugging by simply putting more information, so users should see what failed when debugging mode is enabled (different env. var is used, since CONTAINER_DEBUG was already used in the image).

hhorak commented 7 years ago

@nak3 would something like #202 address your needs?

nak3 commented 7 years ago

@hhorak Thank you. But unfortunately it does not address my needs. Although verbose message would be helpful, mysql pod will still not able to be RUNNING. We probably must run mysql command to fix the issue.

For example, if the deubg log informed of some grant was wrong, how we fix it? mysql pod is still not running, so we cannot access to the database/table.

hhorak commented 7 years ago

Ah, I understand better now. So, it's not only about see where the issue happened, but also how to get it into working state by some troubleshooting techniques, which, in terms of container means to change some starting scripts.

For that, I believe, the new concept of making the image extensible could be helpful. For example, when you see some issue in passwd-change.sh file (that is provided by the image by default), that this script is not capable of fixing by its own, you can provide an own modified passwd-change.sh file by either bind-mounting a volume or using secrets (or creating a new image using source-to-image), simply putting the changed file into /opt/app-root/src/init/passwd-change.sh and this modified file will be used instead of the original one. This should work if there is no other way forward.

However, even your solution to not set the set -eu makes sense to me, sometimes it can help to overcome some random failure of a single command and can be the easiest way forward. However, one must understands that such change can potentially lead to undefined behavior and that it's done on user's responsibility. Thus, I'd not call this variable DEBUG_MODE, but rather something like DEBUG_IGNORE_SCRIPT_FAILURES.. And we should be explicit that running a container with this set is not supported and proper functionality is not guaranteed.

nak3 commented 7 years ago

For that, I believe, the new concept of making the image extensible could be helpful. For example, when you see some issue in passwd-change.sh file (that is provided by the image by default), that this script is not capable of fixing by its own, you can provide an own modified passwd-change.sh file by either bind-mounting a volume or using secrets (or creating a new image using source-to-image), simply putting the changed file into /opt/app-root/src/init/passwd-change.sh and this modified file will be used instead of the original one. This should work if there is no other way forward.

Yes, as I have reported on the bugzilla, we have already reached the solution of re-building the image.

From https://bugzilla.redhat.com/show_bug.cgi?id=1505037

Actual results:
- We had to rebuild MySQL image and put it on public registry.

However, there are so many restrictions that come up.

... etc

However, even your solution to not set the set -eu makes sense to me, sometimes it can help to overcome some random failure of a single command and can be the easiest way forward. However, one must understands that such change can potentially lead to undefined behavior and that it's done on user's responsibility. Thus, I'd not call this variable DEBUG_MODE, but rather something like DEBUG_IGNORE_SCRIPT_FAILURES.. And we should be explicit that running a container with this set is not supported and proper functionality is not guaranteed.

OK, DEBUG_IGNORE_SCRIPT_FAILURES is better. I know the risk of unset -eu. However, database outage is more critical and we want to use DEBUG_IGNORE_SCRIPT_FAILURES if we have the back up data.

nak3 commented 6 years ago

Updated this PR to introduc DEBUG_IGNORE_SCRIPT_FAILURES option to rescue from failure of the setup script.

cc// @hhorak

rhscl-bot commented 6 years ago

Can one of the admins verify this patch?

centos-ci commented 6 years ago

Can one of the admins verify this patch?

hhorak commented 6 years ago

[test]

nak3 commented 6 years ago

ping @hhorak

hhorak commented 6 years ago

Sorry for delay.. I think it's all right, let's merge it.