Open ldangeard-orange opened 6 years ago
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/154352380
The labels on this github issue will be updated when the story is started.
hello @ldangeard-orange
I am curious: what is the DB size in your env?
Hello @GETandSELECT, on my test DB: 15 GB, but we wish to increase it to 50 GB.
Hello,
I tested with cf_mysql.mysql.startup_timeout=600: many problems with monit when we stop one node, etc.
So it's not a good idea to have disable_auto_sst set to false.
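To put the timeout values mentioned here in perspective, the following back-of-the-envelope sketch compares an estimated full-copy time against startup_timeout for the database sizes quoted in this thread. The 50 MiB/s transfer rate is purely an assumption for illustration (nobody in the thread measured it), and the helper name is hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Assumption for illustration only: real SST throughput depends on disks and network.
ASSUMED_THROUGHPUT = 50 * MIB  # bytes per second


def estimated_sst_seconds(data_bytes: int, throughput_bytes_per_s: float) -> float:
    """Rough time needed to stream a full copy of the datadir to a joining node."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_bytes / throughput_bytes_per_s


if __name__ == "__main__":
    for size_gib in (15, 50):          # sizes quoted in this thread
        seconds = estimated_sst_seconds(size_gib * GIB, ASSUMED_THROUGHPUT)
        for timeout in (60, 600):      # default and tested startup_timeout values
            verdict = "fits within" if seconds <= timeout else "exceeds"
            print(f"{size_gib} GiB: ~{seconds:.0f}s {verdict} startup_timeout={timeout}")
```

Under that assumed rate, neither size fits in 60 seconds, and the 50 GiB case would not fit in 600 seconds either, so the timeout has to be sized from the measured transfer rate rather than picked once.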
@ldangeard-orange I'm a little confused. SST should only happen during the BOSH pre-start phase, which is not governed by monit. Are you restarting VMs outside of the BOSH director's control? It's very important to stick to using BOSH to manage the VMs, else the pre-start phase will not run.
Am I understanding the problem, and explaining this correctly?
Hello @menicosia, there are several cases:
1) When you restore MariaDB with SHIELD on a node, you need to stop the other nodes with monit stop mariadb_ctrl. Afterwards, you execute monit start mariadb_ctrl, and Galera runs an SST.
2) When activity is intense and the Galera gcache is too small, a node becomes desynchronized and Galera goes into SST.
Hi @ldangeard-orange,
For case 1 (restore with SHIELD), you'll want to run the pre-start for the mariadb_ctrl job before the job starts, so the node is able to properly perform the SST. The best way to do this is probably to bosh stop the individual mysql nodes, perform the restore, then bosh start the nodes again. This will ensure that the pre-start scripts are run, and should give plenty of time for the SST to run. If you use monit to stop and start, you'll have to run the pre-start script manually.
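The stop, restore, start sequence described above can be sketched as a dry-run plan. This only prints commands and never executes them; the deployment name "cf-mysql", the instance group "mysql" and the node indexes are hypothetical placeholders, and the restore step is left as a comment because it depends on the backup tool in use.

```python
import shlex
from typing import List


def restore_plan(deployment: str, instance_group: str, indexes: List[int]) -> List[str]:
    """Return the ordered steps: stop every node, restore, then start every node."""
    stops = [
        shlex.join(["bosh", "-d", deployment, "stop", f"{instance_group}/{i}"])
        for i in indexes
    ]
    starts = [
        shlex.join(["bosh", "-d", deployment, "start", f"{instance_group}/{i}"])
        for i in indexes
    ]
    # Placeholder step: the actual restore command depends on the backup tool.
    restore = "# perform the restore here"
    return stops + [restore] + starts


if __name__ == "__main__":
    # Hypothetical deployment and instance group names, for illustration only.
    for step in restore_plan("cf-mysql", "mysql", [0, 1, 2]):
        print(step)
```

Going through bosh stop and bosh start, rather than monit, is what makes the pre-start scripts run on each node.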
For case 2 (gcache too small), you probably want to increase the gcache_size to a value that is large enough that you don't run into this problem.
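As a rough way to pick "large enough", one common rule of thumb is to size the gcache so it can hold all the write-sets produced while a node is away. The sketch below assumes you have already measured a replication write rate (for example from the growth of the wsrep_replicated_bytes and wsrep_received_bytes status counters over an interval); the safety factor and the sample numbers are assumptions, not recommendations from this release.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3


def suggested_gcache_bytes(write_rate_bytes_per_s: float,
                           max_outage_s: float,
                           safety_factor: float = 1.5) -> int:
    """Bytes of gcache needed to cover `max_outage_s` of writes, with some headroom."""
    if write_rate_bytes_per_s < 0 or max_outage_s < 0 or safety_factor < 1:
        raise ValueError("rate and outage must be >= 0, safety_factor must be >= 1")
    return math.ceil(write_rate_bytes_per_s * max_outage_s * safety_factor)


if __name__ == "__main__":
    # Made-up figures: 2 MiB/s of replicated writes, node absent for up to one hour.
    needed = suggested_gcache_bytes(2 * MIB, 3600)
    print(f"suggested gcache: {needed / GIB:.2f} GiB")
```

A node that rejoins within that window can then catch up from the gcache instead of falling back to a full SST. Convert the result to whatever unit the gcache_size property expects.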
Hi Marco (@menicosia) and Caroline (@ctaymor),
I'm jumping into this issue in order to clarify things. I'm working with Laurent (@ldangeard-orange) in the Orange FR database experts team. Here, people have very strong expertise in production data services. As a contractor and BOSH expert in France, I'm helping them pivot towards authoring BOSH releases and recommendations that benefit from or encapsulate their expertise. Currently, we focus on MongoDB, Cassandra and MariaDB (with this cf-mysql-release).
Here in this issue, the situation that Laurent describes is the following:
1) [...] switchboard proxies). Or it can be any reason that is documented in the Troubleshooting and Diagnostics chapter of the PCF tile v1.10 documentation.
2) [...] disable_auto_sst is false, this late node naturally starts an SST.
3) [...] /var/vcap/store/mysql (the MySQL datadir) into a .sst subdirectory, so that /var/vcap/store/mysql is nearly emptied.
4) [...] /var/vcap/store/mysql whereas this directory should stay empty. And ironically, Monit doesn't even succeed in this, because MariaDB is missing some files that have been moved to the .sst subdirectory (see the error message pasted by Laurent in his very first post in this thread). So Monit is going to blindly retry and fail several times at restarting MariaDB.
5) [...] .sst subdirectory is copied back to /var/vcap/store/mysql. But this destination directory is no longer empty, so the SST stops with an error because it is not supposed to clobber any existing database file (the error says something like “datadir is not empty”).
In our case, step 1 was triggered by nodes joining back a cluster after a TPCC benchmark. Indeed, we had monit-stopped 2 out of 3 nodes before running a TPCC benchmark on the remaining node. And when the 2 other nodes joined the cluster back with a monit start, the SST was triggered and the failure scenario happened.
So, it's correct that stopping nodes with bosh stop and restarting them with bosh start is better and might work. But the point here is that the SST situation could be triggered naturally in a loaded cluster, as described above. And we have seen this happening in production once in a while.
Normally, Monit should not try to restart the daemon while the system is doing an SST. Maybe the SST script should write its PID into the PID file that Monit is tracking, so that Monit is happy with a live process. But this might require some PRs to be pushed in Galera so that the SST writes its PID in a specific PID file.
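The PID-file idea can be illustrated with a small sketch: a long-running step records its own PID in a file for as long as it runs, so a supervisor that checks the file finds a live process. This is only an illustration of the general technique, not how the Galera SST scripts or this release work; the file name is hypothetical, and handing the PID file over to the database daemon afterwards is not covered.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def hold_pid_file(path: str):
    """Write the current PID to `path` while the block runs, then remove the file."""
    with open(path, "w") as f:
        f.write(f"{os.getpid()}\n")
    try:
        yield
    finally:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # something else already cleaned it up


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pid_path = os.path.join(tmp, "example.pid")  # hypothetical location
        with hold_pid_file(pid_path):
            # A long-running transfer would happen here.
            print("PID file contains:", open(pid_path).read().strip())
        print("PID file still present:", os.path.exists(pid_path))
```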
What Laurent says is that, while waiting for such changes to happen, it would be safer to move back to disable_auto_sst: true. Indeed, when an SST is required, the node just stops with a specific log line (mentioned in the Interruptor section of the PCF tile v1.10 documentation) that seems to have been added to the SST script specifically. Then an operator needs to be alerted about this log line and run an SST manually at the proper time, considering production constraints.
For a long-term solution, assuming that this log line is the result of a customized SST script, why not have this script write its own PID in the proper file so that Monit stays happy? This is just a guess. Now that I have (hopefully) clarified the issue, I'll let you jump in and suggest whatever fix you find most relevant.
As a conclusion, I hope this will help in solving this issue, which is a concern for anyone relying on the default values proposed by this BOSH release.
Best, Benjamin
Hello,
With the new develop version 36.10 (dev), the default value of disable_auto_sst is FALSE.
However, when you have a big database, the copy with Xtrabackup (SST) needs more than 60 seconds (cf_mysql.mysql.startup_timeout is 60 by default). So monit mariadb_ctrl tries to restart the database while the transfer is not finished. Many error messages: [...]
So I think it's better to block SST, because you need to analyse why your instance is desynchronized.
If you want to keep the default value FALSE for disable_auto_sst, you need to:
. increase startup_timeout
. monitor whether the instance is executing an SST, for example with wsrep_cluster_conf_id: http://galeracluster.com/documentation-webpages/monitoringthecluster.html
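The monitoring suggestion could look something like the sketch below. It assumes the wsrep status variables have already been fetched as tab-separated "name, value" lines (for instance with mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_%'"); the sample values are made up, and the function names are hypothetical.

```python
from typing import Dict


def parse_status(text: str) -> Dict[str, str]:
    """Parse tab-separated 'name<TAB>value' lines into a dict."""
    status = {}
    for line in text.splitlines():
        if "\t" in line:
            name, value = line.split("\t", 1)
            status[name.strip()] = value.strip()
    return status


def node_is_synced(status: Dict[str, str]) -> bool:
    """True when the node reports itself as Synced and ready for queries."""
    return (status.get("wsrep_local_state_comment") == "Synced"
            and status.get("wsrep_ready") == "ON")


def membership_changed(previous: Dict[str, str], current: Dict[str, str]) -> bool:
    """True when wsrep_cluster_conf_id differs between two samples."""
    return previous.get("wsrep_cluster_conf_id") != current.get("wsrep_cluster_conf_id")


if __name__ == "__main__":
    # Made-up samples for illustration only.
    before = parse_status("wsrep_local_state_comment\tSynced\n"
                          "wsrep_ready\tON\nwsrep_cluster_conf_id\t12\n")
    after = parse_status("wsrep_local_state_comment\tJoining\n"
                         "wsrep_ready\tOFF\nwsrep_cluster_conf_id\t13\n")
    print("synced before:", node_is_synced(before))
    print("synced after:", node_is_synced(after))
    print("membership changed:", membership_changed(before, after))
```

A node that is not Synced after a change in wsrep_cluster_conf_id is a candidate for a state transfer in progress, which is the moment when an alert is more useful than an automatic restart.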