cloudfoundry / bosh

Cloud Foundry BOSH is an open source tool chain for release engineering, deployment and lifecycle management of large scale distributed services.
https://bosh.io
Apache License 2.0

CentOS stemcell fails to start agent when using bosh-init #846

Closed jghiloni closed 9 years ago

jghiloni commented 9 years ago

I tried to create a new BOSH environment following the instructions on bosh.io, using the CentOS 7 stemcell. However, it always timed out waiting for the mbus agent to start and failed with "connection refused". The exact same procedure with the Ubuntu Trusty stemcell worked.
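For anyone reproducing this, it can help to check by hand whether the agent endpoint refuses the connection or simply never answers. The snippet below is only a sketch: `probe` is a hypothetical helper, not part of bosh-init, and the host and port in the usage comment are taken from the `cloud_provider.mbus` URL in the manifest below.

```python
import socket


def probe(host, port, timeout=5.0):
    """Attempt a TCP connection and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else, for example "no route to host".
        return f"error: {exc}"


# Example usage (address and port come from the manifest, adjust as needed):
#   print(probe("10.52.46.182", 6868))
```

Roughly, "refused" suggests the VM is reachable but nothing is listening on the port (for example, the agent never started), while "timeout" points more towards routing or security group rules. Neither result is conclusive on its own.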

My bosh.yml (sanitized):


---
name: bosh

releases:
- name: bosh
  url: https://bosh.io/d/github.com/cloudfoundry/bosh
  sha1: 3bde287f992a913b568e44fdeb296b5b8306a110
- name: bosh-openstack-cpi
  url: http://bosh.io/d/github.com/cloudfoundry-incubator/bosh-openstack-cpi-release
  sha1: 6110cb3a39e2f4491562ae48fcc6cf9cbcf3ed7d

resource_pools:
- name: vms
  network: private
  stemcell:
    url: http://dxd58qms1.digitalglobe.com/bosh-stemcell-2982-openstack-kvm-ubuntu-trusty-go_agent.tgz
    sha1: 455e92cbce010a7b4de68cd281f33fcfa085509d
  cloud_properties:
    instance_type: m1.medium

disk_pools:
- name: disks
  disk_size: 10_000

networks:
- name: private
  type: manual
  subnets:
  - range: 192.168.245.0/24
    gateway: 192.168.245.1
    dns: [10.149.52.20, 10.52.30.13]
    cloud_properties: {net_id: a7f0fee6-9ac9-428a-bf70-7b3a4765b729}
- name: public
  type: vip

jobs:
- name: bosh
  instances: 1

  templates:
  - {name: nats, release: bosh}
  - {name: redis, release: bosh}
  - {name: postgres, release: bosh}
  - {name: blobstore, release: bosh}
  - {name: director, release: bosh}
  - {name: health_monitor, release: bosh}
  - {name: registry, release: bosh}
  - {name: cpi, release: bosh-openstack-cpi}

  resource_pool: vms
  persistent_disk_pool: disks

  networks:
  - name: private
    static_ips: [192.168.245.2]
    default: [dns, gateway]
  - name: public
    static_ips: [10.52.46.182]

  properties:
    nats:
      address: 127.0.0.1
      user: nats
      password: nats-password

    redis:
      listen_addresss: 127.0.0.1
      address: 127.0.0.1
      password: redis-password

    postgres: &db
      host: 127.0.0.1
      user: postgres
      password: postgres-password
      database: bosh
      adapter: postgres

    registry:
      address: 192.168.254.2
      host: 192.168.254.2
      db: *db
      http: {user: admin, password: admin, port: 25777}
      username: admin
      password: admin
      port: 25777

    blobstore:
      address: 192.168.254.2
      port: 25250
      provider: dav
      director: {user: director, password: director-password}
      agent: {user: agent, password: agent-password}

    director:
      address: 127.0.0.1
      name: my-bosh
      db: *db
      cpi_job: cpi
      max_threads: 3

    hm:
      http: {user: hm, password: hm-password}
      director_account: {user: admin, password: admin}
      resurrector_enabled: true

    openstack: &openstack
      auth_url: http://10.52.148.247:5000/v2.0/tokens
      tenant: consul-dev1
      username: jghiloni
      api_key: ********
      default_key_name: microbosh
      default_security_groups: [bosh]

    agent: {mbus: "nats://nats:nats-password@192.168.254.2:4222"}

    ntp: &ntp [0.pool.ntp.org, 1.pool.ntp.org]

cloud_provider:
  template: {name: cpi, release: bosh-openstack-cpi}

  ssh_tunnel:
    host: 10.52.46.182
    port: 22
    user: vcap
    private_key: ./bosh.pem # Path relative to this manifest file

  mbus: "https://mbus:mbus-password@10.52.46.182:6868"

  properties:
    openstack: *openstack
    agent: {mbus: "https://mbus:mbus-password@0.0.0.0:6868"}
    blobstore: {provider: local, path: /var/vcap/micro_bosh/data/cache}
    ntp: *ntp
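One thing that can be checked mechanically in a manifest like this is whether every address the agent is told to use actually sits inside the private subnet. The sketch below hard-codes the values from the manifest above instead of parsing the YAML; `outside_subnet` is a hypothetical helper, and the assumption that these addresses should all be inside the private range is mine, not something confirmed in this thread. A mismatch may also just be a side effect of sanitizing the file.

```python
import ipaddress

# Values copied by hand from the manifest above.
SUBNET = ipaddress.ip_network("192.168.245.0/24")
ADDRESSES = {
    "jobs.networks.private.static_ips": "192.168.245.2",
    "registry.address": "192.168.254.2",
    "registry.host": "192.168.254.2",
    "blobstore.address": "192.168.254.2",
    "agent.mbus host": "192.168.254.2",
}


def outside_subnet(addresses, subnet):
    """Return the sorted names of entries whose IP is not inside the subnet."""
    return sorted(
        name
        for name, ip in addresses.items()
        if ipaddress.ip_address(ip) not in subnet
    )


if __name__ == "__main__":
    for name in outside_subnet(ADDRESSES, SUBNET):
        print(f"{name} = {ADDRESSES[name]} is outside {SUBNET}")
```

With the values as posted, the static IP is in 192.168.245.0/24 while the registry, blobstore and agent mbus addresses use 192.168.254.2, so those entries are the ones reported.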

I have the debug log as well, but it's about 1.5 MB ... I can place it somewhere if needed.

cppforlife commented 9 years ago

What does cat /var/vcap/bosh/log/current | grep -B2 "Starting agent" show on the created VM? Most likely the agent is failing to bootstrap because of either a networking or a disk problem.

cppforlife commented 9 years ago

Reopen if issue persists.