cloudfoundry-attic / bosh-init

bosh-init is a tool used to create and update the Director VM
Apache License 2.0

CPI 'create_stemcell' method responded with error: read timeout reached #106

Open srinat999 opened 7 years ago

srinat999 commented 7 years ago

I was trying to bootstrap a BOSH environment on OpenStack using the instructions given here, and I got this error at the bosh-init deploy step:

Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-openstack-kvm-ubuntu-trusty-go_agent/3262.9'... Failed (00:01:09)
Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)

Command 'deploy' failed:
  creating stemcell (bosh-openstack-kvm-ubuntu-trusty-go_agent 3262.9):
    CPI 'create_stemcell' method responded with error: CmdError{"type":"Unknown","message":"read timeout reached","ok_to_retry":false}

I'm assuming this is the step where it tries to upload the stemcell as an image to OpenStack, and for some reason it is failing. I'm not sure whether it's an error on my OpenStack side, as creating a normal Ubuntu server image from here also throws an error in OpenStack.

For reference, here is my bosh.yml:


---
name: bosh

releases:
- name: bosh
  url: https://bosh.io/d/github.com/cloudfoundry/bosh?v=257.9
  sha1: 3d6168823f5a8aa6b7427429bc727103e15e27af
- name: bosh-openstack-cpi
  url: https://bosh.io/d/github.com/cloudfoundry-incubator/bosh-openstack-cpi-release?v=27
  sha1: 85e6244978f775c888bbd303b874a2c158eb43c4

resource_pools:
- name: vms
  network: private
  stemcell:
    url: https://bosh.io/d/stemcells/bosh-openstack-kvm-ubuntu-trusty-go_agent?v=3262.9
    sha1: 454286bacf95fbbb82e956ed014964fc7c8eda97
  cloud_properties:
    instance_type: m1.xlarge

disk_pools:
- name: disks
  disk_size: 20_000

networks:
- name: private
  type: manual
  subnets:
  - range: 10.0.0.0/24 # <--- Replace with a private subnet CIDR
    gateway: 10.0.0.1 # <--- Replace with a private subnet's gateway
    dns: [10.0.0.2] # <--- Replace with your DNS
    cloud_properties: {net_id: 99828b4e-cb2b-45bf-97e2-852388bc16c0} # <--- # Replace with private network UUID
- name: public
  type: vip

jobs:
- name: bosh
  instances: 1

  templates:
  - {name: nats, release: bosh}
  - {name: postgres, release: bosh}
  - {name: blobstore, release: bosh}
  - {name: director, release: bosh}
  - {name: health_monitor, release: bosh}
  - {name: registry, release: bosh}
  - {name: openstack_cpi, release: bosh-openstack-cpi}

  resource_pool: vms
  persistent_disk_pool: disks

  networks:
  - name: private
    static_ips: [10.0.0.200] # <--- Replace with a private IP
    default: [dns, gateway]
  - name: public
    static_ips: [139.25.25.201] # <--- Replace with a floating IP

  properties:
    nats:
      address: 127.0.0.1
      user: nats
      password: nats-password

    postgres: &db
      listen_address: 127.0.0.1
      host: 127.0.0.1
      user: postgres
      password: postgres-password
      database: bosh
      adapter: postgres

    registry:
      address: 10.0.0.200 # <--- Replace with a private IP
      host: 10.0.0.200 # <--- Replace with a private IP
      db: *db
      http: {user: admin, password: admin, port: 25777}
      username: admin
      password: admin
      port: 25777

    blobstore:
      address: 10.0.0.200 # <--- Replace with a private IP
      port: 25250
      provider: dav
      director: {user: director, password: director-password}
      agent: {user: agent, password: agent-password}

    director:
      address: 127.0.0.1
      name: my-bosh
      db: *db
      cpi_job: openstack_cpi
      max_threads: 3
      user_management:
        provider: local
        local:
          users:
          - {name: admin, password: admin}
          - {name: hm, password: hm-password}

    hm:
      director_account: {user: hm, password: hm-password}
      resurrector_enabled: true

    openstack: &openstack
      auth_url: https://10.10.0.10:5000/v2.0/tokens # <--- Replace with OpenStack Identity API endpoint
      project: CF # <--- Replace with OpenStack project name
      tenant: CF
      domain: xxxxx # <--- Replace with OpenStack domain name
      username: xxxx # <--- Replace with OpenStack username
      api_key: xxxxx # <--- Replace with OpenStack password
      default_key_name: bosh
      default_security_groups: [bosh]

    agent: {mbus: "nats://nats:nats-password@10.0.0.200:4222"} # <--- Replace with a private IP

    ntp: &ntp [0.pool.ntp.org, 1.pool.ntp.org]

cloud_provider:
  template: {name: openstack_cpi, release: bosh-openstack-cpi}

  ssh_tunnel:
    host: 139.25.25.201 # <--- Replace with a floating IP
    port: 22
    user: vcap
    private_key: ./bosh.pem # Path relative to this manifest file

  mbus: "https://mbus:mbus-password@139.25.25.201:6868" # <--- Replace with a floating IP

  properties:
    openstack: *openstack
    agent: {mbus: "https://mbus:mbus-password@0.0.0.0:6868"}
    blobstore: {provider: local, path: /var/vcap/micro_bosh/data/cache}
    ntp: *ntp

I'm also behind a proxy and I have added the relevant IPs to my no_proxy variable.

dpb587-pivotal commented 7 years ago

Since you're using a proxy, you'll need to add an env section with your http_proxy settings. The referenced properties are listed on bosh.io. Something like...

cloud_provider:
  properties:
    openstack: *openstack
    env:
      http_proxy: ((...your proxy...))

Also add env with the proxy settings to jobs[0].properties if your director will need to use a proxy.

srinat999 commented 7 years ago

Hi,

Thanks for the response. Unfortunately it still doesn't go through. Could you confirm that this is the step where it is trying to upload the stemcell to OpenStack? If you can point me towards the API that it is using, I can manually check whether the API is working.

Thanks Sreenath

beyhan commented 7 years ago

Hi @srinat999,

Yes, it's the step where the BOSH OpenStack CPI uploads the stemcell. Here you can find which APIs are used for this when Image Service API v2 is in use. Which version do you have? Documentation about OpenStack Image Service API v2 is here.

Beyhan
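For anyone who wants to exercise the image upload path by hand, here is a minimal sketch of the two Image Service API v2 calls involved (create an image record, then upload its data). It is not necessarily the exact sequence the CPI issues. The Glance endpoint on port 9292, the `qcow2`/`bare` formats, the image name, and all function names are assumptions made for illustration; adjust them to your environment.

```shell
#!/usr/bin/env bash
# Sketch only: manual check of the OpenStack Image Service API v2 upload path.
# GLANCE_URL is an assumption (9292 is a common Glance port); change as needed.
GLANCE_URL="${GLANCE_URL:-https://10.10.0.10:9292}"

# Print the JSON body for creating an image record.
# disk_format and container_format are assumed values, not taken from the thread.
image_create_body() {
  printf '{"name": "%s", "disk_format": "qcow2", "container_format": "bare"}\n' "$1"
}

# Step 1: create the image record. The response JSON contains the new image "id".
# $1 = auth token, $2 = image name
create_image() {
  curl -s -X POST "$GLANCE_URL/v2/images" \
    -H "X-Auth-Token: $1" \
    -H "Content-Type: application/json" \
    -d "$(image_create_body "$2")"
}

# Step 2: upload the binary image data for an existing record.
# $1 = auth token, $2 = image id from step 1, $3 = path to a local image file
upload_image_data() {
  curl -v -X PUT "$GLANCE_URL/v2/images/$2/file" \
    -H "X-Auth-Token: $1" \
    -H "Content-Type: application/octet-stream" \
    --data-binary "@$3"
}
```

Running `create_image "$TOKEN" upload-test`, then `upload_image_data "$TOKEN" <id> ./some-image.img` with the returned id, should show whether a timeout comes from the API itself rather than from bosh-init.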

srinat999 commented 7 years ago

Hi @beyhan

I investigated the issue with OpenStack and it was indeed a problem with the image upload API. I fixed it and so I was able to progress further. However, in the instance creation step the process fails. This is the error that I get:

Uploading stemcell 'bosh-openstack-kvm-ubuntu-trusty-go_agent/3262.9'... Skipped [Stemcell already uploaded] (00:00:00)

Started deploying
  Creating VM for instance 'bosh/0' from stemcell '313ba8c2-d86f-49f8-bf88-c037ca9e9f08'... Failed (00:02:49)
Failed deploying (00:02:49)

Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)

Command 'deploy' failed:
  Deploying:
    Creating instance 'bosh/0':
      Creating VM:
        Creating vm with stemcell cid '313ba8c2-d86f-49f8-bf88-c037ca9e9f08':
          CPI 'create_vm' method responded with error: CmdError{"type":"Bosh::Clouds::VMCreationFailed","message":"Cannot update settings for 'vm-16604189-41ce-4c22-ba7b-598fd80f4a76', got HTTP 301","ok_to_retry":false}

I also tried the API for creating an instance:

curl -v -s -X POST -H "X-Auth-Token: xxxx" https://10.10.0.10:8774/v2/466a67b24b694f36b48defabd2ebe751/servers -d '{"server": {"name": "auto-allocate-network","imageRef": "313ba8c2-d86f-49f8-bf88-c037ca9e9f08","flavorRef": "http://openstack.example.com/flavors/1","networks":[{"uuid":"99828b4e-cb2b-45bf-97e2-852388bc16c0"}]}}' -H "Content-Type: application/json"

This works and the instance is created. What do you think the problem is?

Thanks Sreenath

beyhan commented 7 years ago

Hi @srinat999,

The error happens after VM creation. bosh-init fails to update the BOSH registry with the settings of the VM that was just created. bosh-init starts a BOSH registry on the machine where it is executed. The registry is required for the deployment and is available on port 6901 (see here). After VM creation, bosh-init sends a PUT request to http://127.0.0.1:6901/instances/<instance_id>/settings to update the registry. In your case the response is HTTP 301. Is something else running on port 6901?
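A rough way to check what answers on that port is sketched below. The function names and the `probe-test` instance id are made up for illustration. Since a proxy was mentioned earlier in the thread, the sketch also compares a request that honours the proxy environment variables with one that bypasses them. Whether the proxy is involved here is only a guess, not something confirmed in this thread. Note that the registry only runs while `bosh-init deploy` is in progress, so the probe has to be run during a deploy.

```shell
#!/usr/bin/env bash
# Sketch only: find out what responds on the local bosh-init registry port.
REGISTRY_PORT=6901   # port mentioned in the comment above

# Build the registry settings URL for a given instance id.
registry_settings_url() {
  printf 'http://127.0.0.1:%s/instances/%s/settings\n' "$REGISTRY_PORT" "$1"
}

# Return 0 if host $1 is an exact entry in the comma-separated list $2
# (for example, the value of $no_proxy). Spaces around entries are ignored.
no_proxy_covers() {
  local host=$1 list=$2 entry
  local IFS=','
  for entry in $list; do
    entry="${entry// /}"
    [ "$entry" = "$host" ] && return 0
  done
  return 1
}

# Show the listener on the port, then compare HTTP status codes with and
# without proxy settings. $1 = instance id (use a throwaway id such as probe-test).
probe_registry() {
  local url
  url=$(registry_settings_url "$1")
  lsof -nP -iTCP:"$REGISTRY_PORT" -sTCP:LISTEN
  curl -s -o /dev/null -w 'with env proxy settings: %{http_code}\n' -X PUT -d '{}' "$url"
  curl -s -o /dev/null -w 'proxy bypassed:          %{http_code}\n' --noproxy '*' -X PUT -d '{}' "$url"
}
```

If `no_proxy_covers 127.0.0.1 "$no_proxy"` fails, or the two status codes printed by `probe_registry probe-test` differ, the request to the local registry may be going through the proxy instead of reaching the registry directly.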