cloudfoundry / cf-mysql-release

Cloud Foundry MySQL Release
Apache License 2.0

pre-start script should fail fast if persistent disk has not been attached #166

Closed ishustava closed 6 years ago

ishustava commented 7 years ago

Issue

When persistent_disk_type is omitted from a cf-mysql deployment manifest, the deployment hangs forever (?) in the pre_start stage. This is especially awkward because bosh cancel task XXX also takes a very long time.

How to reproduce

Deploy cf-mysql without persistent_disk_type specified in the instance_groups section. Observe the deployment hang at the Updating instance... stage. On the VM:

$ monit summary
/var/vcap/bosh/etc/monitrc:8: Warning: include files not found '/var/vcap/monit/job/*.monitrc'
The Monit daemon 5.2.5 uptime: 5m

System 'system_localhost'           running

mysql.err.log:

2017-05-23 22:26:00 7fbd3534d780 InnoDB: Error: Write to file ./ib_logfile1 failed at offset 653262848.
InnoDB: 1048576 bytes should have been written, only 733184 were written.
InnoDB: Operating system error number 28.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 28 means 'No space left on device'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/operating-system-error-codes.html
2017-05-23 22:26:00 140450618201984 [ERROR] InnoDB: Cannot set log file ./ib_logfile1 to size 1024 MB
2017-05-23 22:26:00 140450618201984 [ERROR] Plugin 'InnoDB' init function returned error.
2017-05-23 22:26:00 140450618201984 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

I was able to resolve this issue by killing the pre-start process on the VM.

Desired outcome

It would be nice if the pre-start script could detect that the persistent disk has not been attached and fail fast.
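
A minimal sketch of the kind of fail-fast check requested here, not the release's actual pre-start script. It assumes the persistent disk is mounted at /var/vcap/store (the path that appears in the logs in this thread) and that the mount table is readable at /proc/mounts. The function names is_mounted and require_persistent_disk are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if the directory $1 is listed as a mount
# target in the mount table $2 (defaults to /proc/mounts).
is_mounted() {
  local target="$1" mounts_file="${2:-/proc/mounts}"
  # Field 2 of each mount table line is the mount point.
  awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts_file"
}

# Hypothetical guard: return non-zero immediately, with a clear message,
# when no disk is mounted at the store directory (assumed: /var/vcap/store).
require_persistent_disk() {
  local store_dir="${1:-/var/vcap/store}" mounts_file="${2:-/proc/mounts}"
  if ! is_mounted "$store_dir" "$mounts_file"; then
    echo "persistent disk is not mounted at ${store_dir}; aborting pre-start" >&2
    return 1
  fi
}
```

A pre-start script could call `require_persistent_disk || exit 1` before starting mysqld, so the deployment would fail right away instead of filling the root disk. On systems with util-linux, `mountpoint -q /var/vcap/store` is an alternative to parsing the mount table.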

cf-gitbot commented 7 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/145971959

The labels on this github issue will be updated when the story is started.

menicosia commented 7 years ago

This was released in cf-mysql-release v36. Closing, but always feel free to re-open if there's a continuing issue.

benmoss commented 6 years ago

We're seeing much the same thing on 36.12.0:

2018-04-23 19:53:35 140268210931456 [Warning] mysqld: Disk is full writing '/var/vcap/store/mysql/aria_log.00000001' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
2018-04-23 19:53:35 140268210931456 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
2018-04-23 20:03:35 140268210931456 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
2018-04-23 20:13:35 140268210931456 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs

but the deployment just hangs until it times out. In this case I think the cause is that we didn't size our persistent disk large enough.

cf-gitbot commented 6 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/157014569

The labels on this github issue will be updated when the story is started.

benmoss commented 6 years ago

It looks like the condition at https://github.com/cloudfoundry/cf-mysql-release/blob/66defe82354dfc9826ee3c1fa4e3a91486afd140/jobs/mysql/templates/pre-start-setup.erb#L70-L80 will never be true:

$ df -BMB --output=target,size /var/vcap/store | awk ' NR==2 { print $2 }'
>> 1040MB
$ [[ 1040MB < 10000 ]]
$ echo $?
>> 1

APShirley commented 6 years ago

Well, considering we are comparing lexicographically, because < inside [[ ]] is a string comparison, it could maybe be true?? Either way this is wrong. Will fix.

APShirley commented 6 years ago

There is a way to make it true and it's terrible. 0.5MB.
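
To illustrate the point about string versus numeric comparison, here is a sketch of one way the check could be made numeric. It is an assumption about a possible fix, not the change that was actually made in the release. The helper name disk_smaller_than is hypothetical, and the input is assumed to look like the df output quoted above (digits followed by MB).

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when a df-style size such as "1040MB" is
# numerically below min_mb. Returns 2 if the size is not digits plus "MB".
disk_smaller_than() {
  local size="$1" min_mb="$2"
  local size_mb="${size%MB}"   # strip the trailing unit: "1040MB" -> "1040"
  if ! [[ "$size_mb" =~ ^[0-9]+$ ]]; then
    echo "unexpected disk size: ${size}" >&2
    return 2
  fi
  # (( )) compares integers; 10# forces base 10 even with leading zeros.
  (( 10#$size_mb < min_mb ))
}

# For contrast, < inside [[ ]] compares strings character by character:
#   [[ 1040MB < 10000 ]]   is false, because "4" sorts after "0"
#   [[ 0.5MB  < 10000 ]]   is true,  because "0" sorts before "1"
```

With this helper, `disk_smaller_than 1040MB 10000` succeeds, so a 1040MB disk would be flagged as too small, which the original string comparison never did.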

ctaymor commented 6 years ago

Story has been accepted. Closing issue