cloudfoundry / cf-mysql-release

Cloud Foundry MySQL Release
Apache License 2.0

Switchboard automatic process failure-restart #227

Open 1kaushik1 opened 5 years ago

1kaushik1 commented 5 years ago


Problem you are trying to solve

Our UAA service was unstable for about a minute because of JDBC connection errors. We traced the downtime to a restart of switchboard, the MySQL proxy: the outage happened at the same time the process restarted. Based on the logs below, can you tell us what might have caused the failure and restart? It has happened twice in 10 days.

What is the current broken behavior?

The switchboard process fails and restarts automatically.

Expected:

The switchboard process should not fail.

Deployment Context:


After the restart, we saw the following logs:

{"timestamp":"1561118814.904110193","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.lock.lost-lock","log_level":2,"data":{"error":"Unexpected response code: 500","key":"v1/locks/mysql_lock","session":"1","value":""}}
{"timestamp":"1561118814.904178619","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.lock.done","log_level":1,"data":{"key":"v1/locks/mysql_lock","session":"1","value":""}}
{"timestamp":"1561118814.904283285","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.registration-runner.poll-until-signaled.deregistering-service","log_level":1,"data":{"service":"mysql","session":"2.1","update-interval":"1.5s"}}
{"timestamp":"1561118814.920963049","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.registration-runner.poll-until-signaled.finished","log_level":1,"data":{"service":"mysql","session":"2.1","update-interval":"1.5s"}}
{"timestamp":"1561118814.921007633","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.registration-runner.finished","log_level":1,"data":{"service":"mysql","session":"2"}}
{"timestamp":"1561118814.921078205","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.Received signal","log_level":1,"data":{"signal":2}}
{"timestamp":"1561118814.921156883","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.Received signal","log_level":1,"data":{"signal":2}}
{"timestamp":"1561118814.921186209","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.Proxy runner has exited","log_level":1,"data":{}}
{"timestamp":"1561118814.921303272","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.Switchboard exited unexpectedly","log_level":3,"data":{"error":"Exit trace for group:\nlock exited with error: lock lost\nregistration exited with nil\nhealth exited with nil\nmonitor exited with nil\napi exited with nil\nbridge exited with nil\n","proxyConfig":{"Port":3306,"Backends":[{"Host":"10.3.5.19","Port":3306,"StatusPort":9200,"StatusEndpoint":"galera_status","Name":"backend-0"},{"Host":"10.3.6.19","Port":3306,"StatusPort":9200,"StatusEndpoint":"galera_status","Name":"backend-1"},{"Host":"10.3.7.19","Port":3306,"StatusPort":9200,"StatusEndpoint":"galera_status","Name":"backend-2"}],"HealthcheckTimeoutMillis":5000},"trace":"goroutine 1 [running]:\ngithub.com/cloudfoundry-incubator/switchboard/vendor/code.cloudfoundry.org/lager.(*logger).Fatal(0xc420115da0, 0x8bf522, 0x1f, 0xac1ce0, 0xc4203c7860, 0xc42047c300, 0x1, 0x1)\n\t/var/vcap/packages/switchboard/src/github.com/cloudfoundry-incubator/switchboard/vendor/code.cloudfoundry.org/lager/logger.go:131 +0xc7\nmain.main()\n\t/var/vcap/packages/switchboard/src/github.com/cloudfoundry-incubator/switchboard/main.go:151 +0xd13\n"}}
panic: Exit trace for group:
lock exited with error: lock lost
registration exited with nil
health exited with nil
monitor exited with nil
api exited with nil
bridge exited with nil
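For anyone triaging similar output, here is a rough Python sketch that makes these lager-format lines easier to read. It converts the epoch timestamp to UTC, maps `log_level` to a name, and pulls out which group member in the exit trace stopped with an error. The helper names `parse_lager_line` and `failed_members` are made up for this illustration and are not part of switchboard or lager; the level names assume lager's usual ordering (0 debug, 1 info, 2 error, 3 fatal).

```python
import json
from datetime import datetime, timezone

# Assumed lager level ordering: DEBUG=0, INFO=1, ERROR=2, FATAL=3.
LAGER_LEVELS = {0: "debug", 1: "info", 2: "error", 3: "fatal"}


def parse_lager_line(line):
    """Parse one JSON log line into a small dict for triage (hypothetical helper)."""
    entry = json.loads(line)
    when = datetime.fromtimestamp(float(entry["timestamp"]), tz=timezone.utc)
    message = entry["message"]
    prefix = entry["source"] + "."
    # Drop the binary-path prefix so only the task name remains.
    if message.startswith(prefix):
        message = message[len(prefix):]
    return {
        "time": when,
        "level": LAGER_LEVELS.get(entry["log_level"], "unknown"),
        "message": message,
        "data": entry.get("data", {}),
    }


def failed_members(exit_trace):
    """Return (member, error) pairs for members that exited with an error (hypothetical helper)."""
    failures = []
    for row in exit_trace.splitlines():
        member, sep, error = row.partition(" exited with error: ")
        if sep:
            failures.append((member, error))
    return failures


# First log line from the report above.
sample = (
    '{"timestamp":"1561118814.904110193",'
    '"source":"/var/vcap/packages/switchboard/bin/switchboard",'
    '"message":"/var/vcap/packages/switchboard/bin/switchboard.lock.lost-lock",'
    '"log_level":2,'
    '"data":{"error":"Unexpected response code: 500",'
    '"key":"v1/locks/mysql_lock","session":"1","value":""}}'
)
parsed = parse_lager_line(sample)
print(parsed["time"].isoformat(), parsed["level"], parsed["message"], parsed["data"]["error"])

# The "error" field of the final fatal entry, as it decodes from JSON.
trace = (
    "Exit trace for group:\n"
    "lock exited with error: lock lost\n"
    "registration exited with nil\n"
    "health exited with nil\n"
    "monitor exited with nil\n"
    "api exited with nil\n"
    "bridge exited with nil\n"
)
print(failed_members(trace))
```

Run against the logs in this report, it shows that only the `lock` member exited with an error (`lock lost`), immediately after the `Unexpected response code: 500` on `v1/locks/mysql_lock`; the other members exited with nil.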

cf-gitbot commented 5 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/166877449

The labels on this github issue will be updated when the story is started.