cloudfoundry / cloud_controller_ng

Cloud Foundry Cloud Controller
Apache License 2.0
193 stars 358 forks source link

Cloud Controller api worker not processing entries from delayed jobs table #644

Closed lguidotti closed 8 years ago

lguidotti commented 8 years ago

Thanks for submitting an issue to cloud_controller-ng. We are always trying to improve! To help us, please fill out the following template.

Issue

The api worker cloud_controller_worker_1 job stops processing jobs from the delayed jobs table a few minutes after the VM has been restarted. This is occurring on both api workers in the environment. Monit is being used as the job monitoring tool and monit is not recognizing that the job is idle.

There are not recent errors shown in the ctl_err log for the cloud_controller_worker job and the api worker is quickly processing jobs when the job first gets restarted. The amount of time it takes till the cloud_controller_worker_1 job stops processing entries from the delayed_jobs table varies.

This began occurring after a recent upgrade to CF 235.

cf-gitbot commented 8 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/126433711

The labels on this github issue will be updated when the story is started.

lguidotti commented 8 years ago

It was determined that the api worker stops processing after encountering a 500 error from a service broker

/var/vcap/packages/cloud_contr...dor/bundle/ruby/2.2.0/bin/rake ccadmin appcloud xxx 2016-07-19 13:48:46.615 GMTLOG: statement: UPDATE "delayed_jobs" SET "guid" = 'ce18439a-8449-4b7b-93c0-2d52eac7b5b2', "created_at" = '2016-07-19 13:34:35.509139+0000', "updated_at" = CURRENT_TIMESTAMP, "priority" = 8, "attempts" = 3, "handler" = '--- !ruby/object:VCAP::CloudController::Jobs::ExceptionCatchingJob handler: !ruby/object:VCAP::CloudController::Jobs::RequestJob handler: !ruby/object:VCAP::CloudController::Jobs::TimeoutJob handler: !ruby/object:VCAP::CloudController::Jobs::Services::DeleteOrphanedInstance name: service-instance-deprovision client_attrs: :url: http://xxx service_instance_guid: bd1a964f-643e-4f1c-978d-8de8b103d691 service_plan_guid: 11d551c2-8daa-4889-8c4f-4caf77c908e6 request_id: eb217221-0d06-4034-4bf0-00ff6a5e26c4::456cb920-51c3-4605-9657-839dd8ab7975 ', "last_error" = 'Service instance : The service broker returned an invalid response for the request to http://xxx/v2/service_instances/bd1a964f-643e-4f1c-978d-8de8b103d691. Status Code: 500 Internal Server Error, Body:

Internal Server Error

/var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/lib/services/service_brokers/v2/client.rb:173:in rescue in deprovision'' /var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/lib/services/service_brokers/v2/client.rb:150:indeprovision'' /var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/app/jobs/services/delete_orphaned_instance.rb:21:in perform'' /var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/app/jobs/wrapping_job.rb:9:inperform'' /var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/app/jobs/timeout_job.rb:7:in block in perform'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/timeout.rb:88:inblock in timeout'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/timeout.rb:32:in block in catch'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/timeout.rb:32:incatch'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/timeout.rb:32:in catch'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/timeout.rb:103:intimeout'' /var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/app/jobs/timeout_job.rb:6:in perform'' /var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/app/jobs/request_job.rb:13:inperform'' /var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/app/jobs/wrapping_job.rb:9:in perform'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/backend/base.rb:94:inblock in invoke_job'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:61:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:61:inblock in initialize'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:66:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:66:inexecute'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:40:in run_callbacks'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/backend/base.rb:91:ininvoke_job'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:204:in block (2 levels) in run'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/timeout.rb:88:inblock in timeout'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/timeout.rb:98:in call'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/timeout.rb:98:intimeout'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:204:in block in run'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/benchmark.rb:303:inrealtime'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:203:in run'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:280:inblock in reserve_and_run_one_job'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:61:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:61:inblock in initialize'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:66:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:66:inexecute'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:40:in run_callbacks'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:280:inreserve_and_run_one_job'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:187:in block in work_off'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:186:intimes'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:186:in work_off'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:150:inblock (4 levels) in start'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/benchmark.rb:303:in realtime'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:149:inblock (3 levels) in start'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:61:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:61:inblock in initialize'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:66:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:66:inexecute'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:40:in run_callbacks'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:148:inblock (2 levels) in start'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:147:in loop'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:147:inblock in start'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/plugins/clear_locks.rb:7:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/plugins/clear_locks.rb:7:inblock (2 levels) in class:ClearLocks'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:79:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:79:inblock (2 levels) in add'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:61:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:61:inblock in initialize'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:79:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:79:inblock in add'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:66:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:66:inexecute'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/lifecycle.rb:40:in run_callbacks'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/delayed_job-4.0.6/lib/delayed/worker.rb:146:instart'' /var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/lib/tasks/jobs.rake:40:in start_working'' /var/vcap/data/packages/cloud_controller_ng/99480351ce9399b5630cd5e27a3118702ac29281.1-4d2d7b4bbecfd3d7ff6e6ab938b21fe42dbc22de/cloud_controller_ng/lib/tasks/jobs.rake:17:inblock (2 levels) in <top (required)>'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/task.rb:240:in call'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/task.rb:240:inblock in execute'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/task.rb:235:in each'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/task.rb:235:inexecute'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/task.rb:179:in block in invoke_with_call_chain'' /var/vcap/packages/ruby-2.2.4/lib/ruby/2.2.0/monitor.rb:211:inmon_synchronize'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/task.rb:172:in invoke_with_call_chain'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/task.rb:165:ininvoke'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/application.rb:150:in invoke_task'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/application.rb:106:inblock (2 levels) in top_level'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/application.rb:106:in each'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/application.rb:106:inblock in top_level'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/application.rb:115:in run_with_threads'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/application.rb:100:intop_level'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/application.rb:78:in block in run'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/application.rb:176:instandard_exception_handling'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/lib/rake/application.rb:75:in run'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/gems/rake-10.5.0/bin/rake:33:in<top (required)>'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/bin/rake:23:in load'' /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/vendor/bundle/ruby/2.2.0/bin/rake:23:in
''', "run_at" = '2016-07-19 13:48:42.711521+0000', "locked_at" = '2016-07-19 13:48:46.579642+0000', "failed_at" = NULL, "locked_by" = 'cc_global_worker.api_worker.0.1', "queue" = 'cc-generic', "cf_api_error" = '--- code: 10001 description: ''Service instance : The service broker returned an invalid response for the request to http://xxx/v2/service_instances/bd1a964f-643e-4f1c-978d-8de8b103d691. Status Code: 500 Internal Server Error, Body:

Internal Server Error

'' error_code: CF-ServiceBrokerBadResponse http: uri: http://xxxv2/service_instances/bd1a964f-643e-4f1c-978d-8de8b103d691 method: DELETE status: 500 ' WHERE ("id" = 7121444)

ameowlia commented 8 years ago

Hi @lguidotti,

There is a known issue around vm restarts that sounds like it might be the root cause of your issue. The fix will be released in CF v240.

The workaround, mentioned in the release notes, is below:

To workaround this issue, operators can do a bosh deploy, which will recognize the failing jobs and properly create the run directory.

If this fixes your problem, please close this issue.

Thanks, @adowns01 & @rizwanreza , CAPI Team Members