Closed StBurcher closed 7 years ago
When you ssh onto the VM, does monit summary show the route_registrar as running? If so, it might just be taking longer than you expect to start up, and increasing your update_watch_time may help.
After redeploying, I got the error everywhere:
Failed updating job doppler > doppler/0 (b0d7ef00-f2ad-4b17-9634-528af3e9e51b) (canary): 'doppler/0 (b0d7ef00-f2ad-4b17-9634-528af3e9e51b)' is not running after update. Review logs for failed jobs: doppler (00:01:14)
Failed updating job loggregator_trafficcontroller > loggregator_trafficcontroller/0 (e532f1d7-bd62-4564-bb56-274076d00f63) (canary): 'loggregator_trafficcontroller/0 (e532f1d7-bd62-4564-bb56-274076d00f63)' is not running after update. Review logs for failed jobs: loggregator_trafficcontroller, metron_agent, route_registrar (00:01:15)
Failed updating job route_emitter > route_emitter/0 (0d8a9b0e-bb72-48c1-a0ef-d749a7935953) (canary): 'route_emitter/0 (0d8a9b0e-bb72-48c1-a0ef-d749a7935953)' is not running after update. Review logs for failed jobs: consul_agent, route_emitter, metron_agent (00:01:22)
Failed updating job router > router/0 (1ac5183f-2652-4409-9a7a-d4ebc641c323) (canary): 'router/0 (1ac5183f-2652-4409-9a7a-d4ebc641c323)' is not running after update. Review logs for failed jobs: gorouter, metron_agent, consul_agent (00:01:24)
Failed updating job brain > brain/0 (b768c8b4-10ae-48ec-a38b-62b819a81d5d) (canary): 'brain/0 (b768c8b4-10ae-48ec-a38b-62b819a81d5d)' is not running after update. Review logs for failed jobs: consul_agent, auctioneer, converger, metron_agent (00:01:24)
Failed updating job cc_bridge > cc_bridge/0 (aa452943-63a3-4777-86c1-f69db78b34f3) (canary): 'cc_bridge/0 (aa452943-63a3-4777-86c1-f69db78b34f3)' is not running after update. Review logs for failed jobs: consul_agent, stager, nsync_listener, nsync_bulker, tps_listener, tps_watcher, cc_uploader, metron_agent (00:01:28)
Failed updating job access > access/0 (f198a985-f981-45f4-8384-bd12f9bb8995) (canary): 'access/0 (f198a985-f981-45f4-8384-bd12f9bb8995)' is not running after update. Review logs for failed jobs: consul_agent, ssh_proxy, metron_agent, file_server (00:01:29)
Failed updating job api > api/0 (414b6459-253f-41f8-a227-641906b33988) (canary): Action Failed get_task: Task 331c4b02-3379-4fca-4828-34b14e6ee1bc result: 1 of 4 pre-start scripts failed. Failed Jobs: cloud_controller_ng. Successful Jobs: cloud_controller_worker, cloud_controller_clock, consul_agent. (00:01:56)
Failed updating job cell > cell/0 (ab5d973c-0959-4e0e-9132-5b03a30a6fe8) (canary): 'cell/0 (ab5d973c-0959-4e0e-9132-5b03a30a6fe8)' is not running after update. Review logs for failed jobs: consul_agent, rep, garden, metron_agent (00:02:19)
Failed updating job database > database/0 (75ebf467-81b7-413d-b5fc-942694402d2f) (canary): 'database/0 (75ebf467-81b7-413d-b5fc-942694402d2f)' is not running after update. Review logs for failed jobs: consul_agent, etcd, etcd_consistency_checker, bbs, metron_agent (00:02:35)
Failed updating job uaa > uaa/0 (fe92dab6-299c-4fe2-ab07-4cf0aa653b17) (canary): 'uaa/0 (fe92dab6-299c-4fe2-ab07-4cf0aa653b17)' is not running after update. Review logs for failed jobs: metron_agent, route_registrar (00:02:38)
Error 400007: 'doppler/0 (b0d7ef00-f2ad-4b17-9634-528af3e9e51b)' is not running after update. Review logs for failed jobs: doppler
Monit summary on uaa shows:
Process 'uaa' running
Process 'metron_agent' running
Process 'route_registrar' Execution failed
System 'system_localhost' running
The log file route_registrar.stderr.log contains:
panic: nats: No servers available for connection
goroutine 1 [running]:
github.com/cloudfoundry-incubator/route-registrar/Godeps/_workspace/src/github.com/pivotal-golang/lager.(*logger).Fatal(0xc8200162a0, 0x7d9850, 0x12, 0x7fce1d89a028, 0xc820011280, 0x0, 0x0, 0x0)
    /var/vcap/packages/route_registrar/src/github.com/cloudfoundry-incubator/route-registrar/Godeps/_workspace/src/github.com/pivotal-golang/lager/logger.go:152 +0x698
main.main()
    /var/vcap/packages/route_registrar/src/github.com/cloudfoundry-incubator/route-registrar/main.go:85 +0x126f

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1721 +0x1

goroutine 5 [syscall]:
os/signal.loop()
    /usr/local/go/src/os/signal/signal_unix.go:22 +0x18
created by os/signal.init.1
    /usr/local/go/src/os/signal/signal_unix.go:28 +0x37

goroutine 6 [select, locked to thread]:
runtime.gopark(0x836d90, 0xc820022728, 0x7a59e0, 0x6, 0x42d918, 0x2)
    /usr/local/go/src/runtime/proc.go:185 +0x163
runtime.selectgoImpl(0xc820022728, 0x0, 0x18)
    /usr/local/go/src/runtime/select.go:392 +0xa64
runtime.selectgo(0xc820022728)
    /usr/local/go/src/runtime/select.go:212 +0x12
runtime.ensureSigM.func1()
    /usr/local/go/src/runtime/signal1_unix.go:227 +0x353
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1721 +0x1
@StBurcher It seems like the NATS machines are not reachable or are failing?
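One way to test that guess from the failing VM is a plain TCP connection attempt against the NATS host. The following is only a minimal sketch: the helper name can_connect and the address in the comment are hypothetical, and 4222 is the default NATS client port, which your manifest may override.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the with-block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (hypothetical NATS address, default NATS client port):
# can_connect("10.0.0.10", 4222)
```

If this returns False from the uaa VM but True from a VM whose route_registrar works, the problem is more likely networking or security groups than the route_registrar job itself.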
I am getting confused. After removing the whole system and reinstalling it with CF-238, only UAA is not running. Error:
Director task 315
Started preparing deployment > Preparing deployment. Done (00:00:01)
Started preparing package compilation > Finding packages to compile. Done (00:00:01)
Started updating job uaa > uaa/0 (cff050ec-dc6b-4560-9a61-3282f92a3f83) (canary). Failed: 'uaa/0 (cff050ec-dc6b-4560-9a61-3282f92a3f83)' is not running after update. Review logs for failed jobs: metron_agent (00:03:34)
Error 400007: 'uaa/0 (cff050ec-dc6b-4560-9a61-3282f92a3f83)' is not running after update. Review logs for failed jobs: metron_agent
Task 315 error
But bosh instances --ps shows:
| uaa/0 (cff050ec-dc6b-4560-9a61-3282f92a3f83)* | running | INDIA | medium | 10.0.0.40 |
|   uaa                                         | running |       |        |           |
|   metron_agent                                | running |       |        |           |
|   route_registrar                             | running |       |        |           |
Since metron_agent was listed as not running, but it was running when you ran bosh instances --ps, you might just need to increase the canary_watch_time/update_watch_time (http://bosh.io/docs/deployment-manifest.html#update). It might have just needed some additional time to start depending on your environment.
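For illustration, the update block of a deployment manifest could look roughly like the fragment below. The key names are the ones described in the linked documentation; the numbers are assumptions chosen only as an example. Watch times are given in milliseconds, and a range such as min-max means the director waits the minimum and then keeps checking until the maximum.

```yaml
update:
  canaries: 1
  max_in_flight: 1
  # Illustrative values only (milliseconds); tune them for your environment.
  canary_watch_time: 30000-600000
  update_watch_time: 5000-600000
```

Raising the upper bound gives slow-starting jobs such as metron_agent more time before the director reports the instance as not running.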
Hi,
Thank you, that worked. I have changed the time.
Hi,
I have a problem with the route_registrar during a cf deployment. It will not start, but only for the UAA. All other registrar entries, e.g. the Blobstore, are working. I got the following error message:
The UAA part of the manifest looks like:
/var/vcap/sys/log/route_registrar/route_registrar.stderr.log is empty.
The last entries in route_registrar.stdout.log are:
Why do I get this timeout?