cloudfoundry / bosh

Cloud Foundry BOSH is an open source tool chain for release engineering, deployment and lifecycle management of large scale distributed services.
https://bosh.io
Apache License 2.0
2.03k stars 658 forks source link

unresponsive agent VMs deployed from child BOSH #1123

Closed subhankarc closed 7 years ago

subhankarc commented 8 years ago

I have BOSH-lite deployed on my local machine and I have deployed bosh deploy BOSH using the following mainfest.


---
name: microbosh

director_uuid: 35b3e645-7b07-435a-bfab-69f718e0e8ef

releases:
- {name: bosh,            version: 244}
- {name: bosh-warden-cpi, version: latest}
- {name: garden-linux,    version: latest}

jobs:
- name: bosh
  instances: 1
  templates:
  - {name: nats, release: bosh}
  - {name: redis, release: bosh}
  - {name: blobstore, release: bosh}
  - {name: postgres, release: bosh}
  - {name: director, release: bosh}
  - {name: health_monitor, release: bosh}
  - {name: warden_cpi, release: bosh-warden-cpi}
  - {name: garden, release: garden-linux}
  resource_pool: default
  persistent_disk: 8_000
  networks:
  - {name: default, static_ips: [10.244.9.2]}
  cloud_properties:
   cpu: 1
   ram: 512
   disk: 1_000

properties:
  ntp: ["0.us.pool.ntp.org"]

  nats:
    address: 127.0.0.1
    user: nats
    password: admin

  blobstore:
    address: 127.0.0.1
    port: 25251
    backend_port: 25552
    agent: {user: agent, password: admin}
    director: {user: director, password: admin}

  postgres: &db
    host: 127.0.0.1
    port: 5432
    user: bosh
    password: admin
    database: bosh

  redis:
    address: 127.0.0.1
    password: admin

  director:
    name: bats
    address: 10.244.9.2
    encryption: false
    backend_port: 25556
    endpoint: https://10.244.9.2:25556
    enable_snapshots: true
    cpi_job: warden_cpi
    db: *db

    # Compatibility with current garden networking manifest config
    ignore_missing_gateway: true

  hm:
    http:
      port: 25923
      user: admin
      password: admin
    director_account:
      user: admin
      password: admin
    intervals:
      poll_director: 60
      poll_grace_period: 30
      log_stats: 300
      analyze_agents: 60
      agent_timeout: 180
      rogue_agent_alert: 180
    loglevel: info
    email_notifications: false
    tsdb_enabled: false
    cloud_watch_enabled: false
    resurrector_enabled: true

  # cpi job template
  warden_cpi:
    host_ip: 10.244.9.2
  #  loopback_range: [100, 130]
    warden:
      connect_network: tcp
      connect_address: 127.0.0.1:7777
    actions:
      stemcells_dir: "/var/vcap/data/warden_cpi/stemcells"
      disks_dir: "/var/vcap/store/warden_cpi/disks"
      host_ephemeral_bind_mounts_dir: "/var/vcap/data/warden_cpi/ephemeral_bind_mounts_dir"
      host_persistent_bind_mounts_dir: "/var/vcap/data/warden_cpi/persistent_bind_mounts_dir"
    agent:
      mbus: nats://nats:admin@10.244.9.2:4222
      blobstore:
        provider: dav
        options:
          endpoint: http://10.244.9.2:25251
          user: agent
          password: admin

  # garden job template
  garden:
    listen_network: tcp
    listen_address: 0.0.0.0:7777
    disk_quota_enabled: false
    allow_host_access: true
    # todo expecting fix for this from garden team
    mount_btrfs_loopback: false

resource_pools:
- name: default
  stemcell:
    name: bosh-warden-boshlite-ubuntu-trusty-go_agent
    version: latest
  network: default
  cloud_properties:
   cpu: 1
   ram: 512
   disk: 1_000

networks:
- name: default
  type: manual
  subnets:
  - range: 10.244.9.0/30
    reserved: [10.244.9.1]
    static: [10.244.9.2]
    cloud_properties: {}
- name: compilation
  type: dynamic
  cloud_properties: {}

compilation:
  workers: 5
  network: compilation
  cloud_properties:
   ram: 2048
   disk: 6_000
   cpu: 1

update:
  canaries: 1
  max_in_flight: 1
  canary_watch_time: 5000 - 90000
  update_watch_time: 5000 - 15000

Now I tried to target this bosh director using bosh target 10.244.9.2 .

In this new director I can upload stemcell, upload release and then try to deploy a release. It works successfully. But after few minutes, the VMs of this deployment done from the new director becomes unresponsive.

$ bosh vms
Deployment `redis-warden'

Director task 80

Task 80 done

+-----------------+--------------------+---------------+-----+
| Job/index       | State              | Resource Pool | IPs |
+-----------------+--------------------+---------------+-----+
| unknown/unknown | unresponsive agent |               |     |
| unknown/unknown | unresponsive agent |               |     |
| unknown/unknown | unresponsive agent |               |     |
| unknown/unknown | unresponsive agent |               |     |
| unknown/unknown | unresponsive agent |               |     |
+-----------------+--------------------+---------------+-----+

Can someone help me with this if I am missing something here ?

Regards, Subhankar

dpb587-pivotal commented 8 years ago

@subhankarc, is this the same issue as the mailing list post? (just in case others are looking for more detail)

subhankarc commented 8 years ago

Yes it is the same post. Thanks for tagging it.

cppforlife commented 8 years ago

@subhankarc while bosh vms is showing information successfully. can you record the ips shown and try to ssh into one of the vms. from there please grab http://bosh.io/docs/job-logs.html#agent-logs. we've not seen this situation happen after a few mins. this typically would happen after laptop goes to sleep (https://github.com/cloudfoundry/bosh-lite/blob/master/docs/bosh-cck.md).

subhankarc commented 8 years ago

@cppforlife The VMs goes down within 4-5 minutes of deployment, so no question of laptop sleep.

Here are the agent logs as you asked for below which I could collect after which the VM went down. It is a redis cluster that I deploy. I can connect to redis server till the time it stays alive. Also please see this post where I have pasted the vi /var/vcap/sys/log/health_monitor/health_monitor.log details which shows somehting about unmanaged agent. Looks like HM itself killing the VMs.

2016-02-05_09:13:13.99684 ********************
2016-02-05_09:13:13.99711 [Action Dispatcher] 2016/02/05 09:13:13 INFO - Running async action run_script
2016-02-05_09:13:13.99756 [MBus Handler] 2016/02/05 09:13:13 INFO - Responding
2016-02-05_09:13:13.99885 [MBus Handler] 2016/02/05 09:13:13 DEBUG - Payload
2016-02-05_09:13:13.99888 ********************
2016-02-05_09:13:13.99918 {"value":{"agent_task_id":"f1eeb4a6-44e2-4495-7f64-b0b846439bd6","state":"running"}}
2016-02-05_09:13:13.99920 ********************
2016-02-05_09:13:13.99920 [File System] 2016/02/05 09:13:13 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2016-02-05_09:13:13.99921 [File System] 2016/02/05 09:13:13 DEBUG - Reading file /var/vcap/bosh/spec.json
2016-02-05_09:13:13.99923 [File System] 2016/02/05 09:13:13 DEBUG - Read content
2016-02-05_09:13:13.99925 ********************
2016-02-05_09:13:13.99925 {"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"redis_leader_z1","release":"","template":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e","templates":[{"name":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e"}]},"packages":{"redis-server":{"name":"redis-server","version":"b53d5357ab95a74c9489cd98a024e6ef6047aba0.1","sha1":"3ecef141a5da0f4889d37368f3bc6d844f76feb9","blobstore_id":"2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4"}},"configuration_hash":"32fa4ba645f6875371db43a43cb2029898276c92","networks":{"redis1":{"cloud_properties":{"name":"random"},"default":["dns","gateway"],"dns_record_name":"0.redis-leader-z1.redis1.redis-warden.bosh","ip":"10.244.2.2","netmask":"255.255.255.252"}},"resource_pool":{"cloud_properties":{"name":"random"},"name":"small_z1","stemcell":{"name":"bosh-warden-boshlite-ubuntu-trusty-go_agent","version":"3147"}},"deployment":"redis-warden","index":0,"id":"","persistent_disk":4096,"rendered_templates_archive":{"sha1":"0a0b6cba58fcb64720ac9ef094787983741fe95e","blobstore_id":"2dd5fd15-ccdc-446a-a248-981bfecef8ca"}}
2016-02-05_09:13:13.99931 ********************
2016-02-05_09:13:13.99932 [File System] 2016/02/05 09:13:13 DEBUG - Checking if file exists /var/vcap/jobs/redis/bin/pre-start
2016-02-05_09:13:13.99933 [ParallelScript] 2016/02/05 09:13:13 DEBUG - Did not find 'pre-start' script in job 'redis'
2016-02-05_09:13:13.99934 [ParallelScript] 2016/02/05 09:13:13 INFO - Will run 0 pre-start scripts in parallel
2016-02-05_09:13:14.00135 [MBus Handler] 2016/02/05 09:13:14 INFO - Received request with action get_task
2016-02-05_09:13:14.00137 [MBus Handler] 2016/02/05 09:13:14 DEBUG - Payload
2016-02-05_09:13:14.00138 ********************
2016-02-05_09:13:14.00139 {"protocol":2,"method":"get_task","arguments":["f1eeb4a6-44e2-4495-7f64-b0b846439bd6"],"reply_to":"director.bdfa0a50-23b6-4947-92e1-90b6eef26b8e.696be3d8-e352-460b-b94c-9154ae36774c"}
2016-02-05_09:13:14.00140 ********************
2016-02-05_09:13:14.00141 [Action Dispatcher] 2016/02/05 09:13:14 INFO - Running sync action get_task
2016-02-05_09:13:14.00157 [MBus Handler] 2016/02/05 09:13:14 INFO - Responding
2016-02-05_09:13:14.00188 [MBus Handler] 2016/02/05 09:13:14 DEBUG - Payload
2016-02-05_09:13:14.00190 ********************
2016-02-05_09:13:14.00191 {"value":{}}
2016-02-05_09:13:14.00192 ********************
2016-02-05_09:13:14.00508 [MBus Handler] 2016/02/05 09:13:14 INFO - Received request with action start
2016-02-05_09:13:14.00509 [MBus Handler] 2016/02/05 09:13:14 DEBUG - Payload
2016-02-05_09:13:14.00510 ********************
2016-02-05_09:13:14.00511 {"protocol":2,"method":"start","arguments":[],"reply_to":"director.bdfa0a50-23b6-4947-92e1-90b6eef26b8e.806add20-89c7-4fc3-9b18-c29d6ffa476a"}
2016-02-05_09:13:14.00513 ********************
2016-02-05_09:13:14.00540 [Action Dispatcher] 2016/02/05 09:13:14 INFO - Running sync action start
2016-02-05_09:13:14.00561 [File System] 2016/02/05 09:13:14 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2016-02-05_09:13:14.00578 [File System] 2016/02/05 09:13:14 DEBUG - Reading file /var/vcap/bosh/spec.json
2016-02-05_09:13:14.00597 [File System] 2016/02/05 09:13:14 DEBUG - Read content
2016-02-05_09:13:14.00599 ********************
2016-02-05_09:13:14.00600 {"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"redis_leader_z1","release":"","template":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e","templates":[{"name":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e"}]},"packages":{"redis-server":{"name":"redis-server","version":"b53d5357ab95a74c9489cd98a024e6ef6047aba0.1","sha1":"3ecef141a5da0f4889d37368f3bc6d844f76feb9","blobstore_id":"2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4"}},"configuration_hash":"32fa4ba645f6875371db43a43cb2029898276c92","networks":{"redis1":{"cloud_properties":{"name":"random"},"default":["dns","gateway"],"dns_record_name":"0.redis-leader-z1.redis1.redis-warden.bosh","ip":"10.244.2.2","netmask":"255.255.255.252"}},"resource_pool":{"cloud_properties":{"name":"random"},"name":"small_z1","stemcell":{"name":"bosh-warden-boshlite-ubuntu-trusty-go_agent","version":"3147"}},"deployment":"redis-warden","index":0,"id":"","persistent_disk":4096,"rendered_templates_archive":{"sha1":"0a0b6cba58fcb64720ac9ef094787983741fe95e","blobstore_id":"2dd5fd15-ccdc-446a-a248-981bfecef8ca"}}
2016-02-05_09:13:14.00607 ********************
2016-02-05_09:13:14.00710 [renderedJobApplier] 2016/02/05 09:13:14 DEBUG - Configuring job {redis 1c917ee3b8d3639bf788777602fa4dc74647dcdf {0a0b6cba58fcb64720ac9ef094787983741fe95e 2dd5fd15-ccdc-446a-a248-981bfecef8ca redis} [{redis-server b53d5357ab95a74c9489cd98a024e6ef6047aba0.1 {3ecef141a5da0f4889d37368f3bc6d844f76feb9 2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4 }}]} with index 0
2016-02-05_09:13:14.00729 [File System] 2016/02/05 09:13:14 DEBUG - Checking if file exists /var/vcap/data/jobs/redis/1c917ee3b8d3639bf788777602fa4dc74647dcdf-0a0b6cba58fcb64720ac9ef094787983741fe95e
2016-02-05_09:13:14.00748 [File System] 2016/02/05 09:13:14 DEBUG - Checking if file exists /var/vcap/data/jobs/redis/1c917ee3b8d3639bf788777602fa4dc74647dcdf-0a0b6cba58fcb64720ac9ef094787983741fe95e/monit
2016-02-05_09:13:14.00766 [File System] 2016/02/05 09:13:14 DEBUG - Reading file /var/vcap/data/jobs/redis/1c917ee3b8d3639bf788777602fa4dc74647dcdf-0a0b6cba58fcb64720ac9ef094787983741fe95e/monit
2016-02-05_09:13:14.00783 [File System] 2016/02/05 09:13:14 DEBUG - Read content
2016-02-05_09:13:14.00784 ********************
2016-02-05_09:13:14.00785 check process redis
2016-02-05_09:13:14.00786   with pidfile /var/vcap/sys/run/redis/redis.pid
2016-02-05_09:13:14.00787   start program "/var/vcap/jobs/redis/bin/monit_debugger redis_ctl '/var/vcap/jobs/redis/bin/redis_ctl start'"
2016-02-05_09:13:14.00788   stop program "/var/vcap/jobs/redis/bin/monit_debugger redis_ctl '/var/vcap/jobs/redis/bin/redis_ctl stop'"
2016-02-05_09:13:14.00789   group vcap
2016-02-05_09:13:14.00790
2016-02-05_09:13:14.00792 ********************
2016-02-05_09:13:14.00810 [File System] 2016/02/05 09:13:14 DEBUG - Writing /var/vcap/monit/job/0000_redis.monitrc
2016-02-05_09:13:14.00829 [File System] 2016/02/05 09:13:14 DEBUG - Making dir /var/vcap/monit/job with perm 511
2016-02-05_09:13:14.00884 [File System] 2016/02/05 09:13:14 DEBUG - Write content
2016-02-05_09:13:14.00886 ********************
2016-02-05_09:13:14.00887 check process redis
2016-02-05_09:13:14.00888   with pidfile /var/vcap/sys/run/redis/redis.pid
2016-02-05_09:13:14.00889   start program "/var/vcap/jobs/redis/bin/monit_debugger redis_ctl '/var/vcap/jobs/redis/bin/redis_ctl start'"
2016-02-05_09:13:14.00891   stop program "/var/vcap/jobs/redis/bin/monit_debugger redis_ctl '/var/vcap/jobs/redis/bin/redis_ctl stop'"
2016-02-05_09:13:14.00892   group vcap
2016-02-05_09:13:14.00893
2016-02-05_09:13:14.00894 ********************
2016-02-05_09:13:14.00929 [File System] 2016/02/05 09:13:14 DEBUG - Glob '/var/vcap/data/jobs/redis/1c917ee3b8d3639bf788777602fa4dc74647dcdf-0a0b6cba58fcb64720ac9ef094787983741fe95e/*.monit'
2016-02-05_09:13:14.00961 [http-client] 2016/02/05 09:13:14 DEBUG - status function called
2016-02-05_09:13:14.00979 [http-client] 2016/02/05 09:13:14 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:13:14.00998 [attemptRetryStrategy] 2016/02/05 09:13:14 DEBUG - Making attempt #0
2016-02-05_09:13:14.01031 [clientRetryable] 2016/02/05 09:13:14 DEBUG - [requestID=1eda963b-d855-4642-54cf-f7e1248b2c77] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:13:14.01951 [clientRetryable] 2016/02/05 09:13:14 DEBUG - [requestID=1eda963b-d855-4642-54cf-f7e1248b2c77] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:13:14.02684 [Cmd Runner] 2016/02/05 09:13:14 DEBUG - Running command: monit reload
2016-02-05_09:13:14.03966 [Cmd Runner] 2016/02/05 09:13:14 DEBUG - Stdout:
2016-02-05_09:13:14.03968 [Cmd Runner] 2016/02/05 09:13:14 DEBUG - Stderr: Reinitializing monit daemon
2016-02-05_09:13:14.03969 [Cmd Runner] 2016/02/05 09:13:14 DEBUG - Successful: true (0)
2016-02-05_09:13:14.03970 [http-client] 2016/02/05 09:13:14 DEBUG - status function called
2016-02-05_09:13:14.03971 [http-client] 2016/02/05 09:13:14 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:13:14.03971 [attemptRetryStrategy] 2016/02/05 09:13:14 DEBUG - Making attempt #0
2016-02-05_09:13:14.03972 [clientRetryable] 2016/02/05 09:13:14 DEBUG - [requestID=1a0193b1-b340-4ba3-67e4-82cc5ba36ef7] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:13:14.04979 [clientRetryable] 2016/02/05 09:13:14 DEBUG - [requestID=1a0193b1-b340-4ba3-67e4-82cc5ba36ef7] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:13:14.04980 [monitJobSupervisor] 2016/02/05 09:13:14 DEBUG - Waiting for monit to reload: before=1454663588 after=1454663588
2016-02-05_09:13:14.08297 2016/02/05 09:13:14 mail from: "monit@localhost"
2016-02-05_09:13:14.12529 [agent] 2016/02/05 09:13:14 DEBUG - Ignored monit event: %!(EXTRA string=Monit instance changed)
2016-02-05_09:13:14.14790 2016/02/05 09:13:14 mail from: "monit@localhost"
2016-02-05_09:13:14.19110 [NATS Handler] 2016/02/05 09:13:14 INFO - Sending hm message 'alert'
2016-02-05_09:13:14.19849 [NATS Handler] 2016/02/05 09:13:14 DEBUG - Message Payload
2016-02-05_09:13:14.19852 ********************
2016-02-05_09:13:14.19854 {"id":"1454663594.2078603955@fa299ad9-f868-481b-a63e-c45522fafe8c","severity":1,"title":"redis (10.244.2.2) - Does not exist - restart","summary":"process is not running","created_at":1454663594}
2016-02-05_09:13:14.19859 ********************
2016-02-05_09:13:19.06811 [http-client] 2016/02/05 09:13:19 DEBUG - status function called
2016-02-05_09:13:19.06817 [http-client] 2016/02/05 09:13:19 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:13:19.06818 [attemptRetryStrategy] 2016/02/05 09:13:19 DEBUG - Making attempt #0
2016-02-05_09:13:19.06820 [clientRetryable] 2016/02/05 09:13:19 DEBUG - [requestID=13785cbc-8626-4fc3-5264-c198c44e685e] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:13:19.06821 [clientRetryable] 2016/02/05 09:13:19 DEBUG - [requestID=13785cbc-8626-4fc3-5264-c198c44e685e] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:13:19.06823 [http-client] 2016/02/05 09:13:19 DEBUG - status function called
2016-02-05_09:13:19.06824 [http-client] 2016/02/05 09:13:19 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:13:19.06825 [attemptRetryStrategy] 2016/02/05 09:13:19 DEBUG - Making attempt #0
2016-02-05_09:13:19.06826 [clientRetryable] 2016/02/05 09:13:19 DEBUG - [requestID=9e269313-9096-4dea-4b68-11ec69d901e2] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:13:19.06830 [clientRetryable] 2016/02/05 09:13:19 DEBUG - [requestID=9e269313-9096-4dea-4b68-11ec69d901e2] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:13:19.06831 [monitJobSupervisor] 2016/02/05 09:13:19 DEBUG - Starting service redis
2016-02-05_09:13:19.06832 [http-client] 2016/02/05 09:13:19 DEBUG - Monit request: url='http://127.0.0.1:2822/redis' body='action=start'
2016-02-05_09:13:19.06834 [attemptRetryStrategy] 2016/02/05 09:13:19 DEBUG - Making attempt #0
2016-02-05_09:13:19.06836 [clientRetryable] 2016/02/05 09:13:19 DEBUG - [requestID=cbfe37ec-86a7-4971-7039-096bc94e8986] Requesting (attempt=1): Request{ Method: 'POST', URL: 'http://127.0.0.1:2822/redis' }
2016-02-05_09:13:19.09004 [clientRetryable] 2016/02/05 09:13:19 DEBUG - [requestID=cbfe37ec-86a7-4971-7039-096bc94e8986] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:13:19.09005 [File System] 2016/02/05 09:13:19 DEBUG - Remove all /var/vcap/monit/stopped
2016-02-05_09:13:19.09009 [MBus Handler] 2016/02/05 09:13:19 INFO - Responding
2016-02-05_09:13:19.09011 [MBus Handler] 2016/02/05 09:13:19 DEBUG - Payload
2016-02-05_09:13:19.09013 ********************
2016-02-05_09:13:19.09014 {"value":"started"}
2016-02-05_09:13:19.09016 ********************
2016-02-05_09:13:19.10237 2016/02/05 09:13:19 mail from: "monit@localhost"
2016-02-05_09:13:19.14515 [agent] 2016/02/05 09:13:19 DEBUG - Ignored monit event: %!(EXTRA string=Action done)
2016-02-05_09:13:19.16941 2016/02/05 09:13:19 mail from: "monit@localhost"
2016-02-05_09:13:19.21869 [agent] 2016/02/05 09:13:19 DEBUG - Ignored monit event: %!(EXTRA string=Exists)
2016-02-05_09:13:20.09777 [MBus Handler] 2016/02/05 09:13:20 INFO - Received request with action get_state
2016-02-05_09:13:20.09823 [MBus Handler] 2016/02/05 09:13:20 DEBUG - Payload
2016-02-05_09:13:20.09825 ********************
2016-02-05_09:13:20.09854 {"protocol":2,"method":"get_state","arguments":[],"reply_to":"director.bdfa0a50-23b6-4947-92e1-90b6eef26b8e.9499cc4d-2753-4fed-a0da-36bc46c48bd8"}
2016-02-05_09:13:20.09855 ********************
2016-02-05_09:13:20.09856 [Action Dispatcher] 2016/02/05 09:13:20 INFO - Running sync action get_state
2016-02-05_09:13:20.09912 [File System] 2016/02/05 09:13:20 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2016-02-05_09:13:20.09971 [File System] 2016/02/05 09:13:20 DEBUG - Reading file /var/vcap/bosh/spec.json
2016-02-05_09:13:20.09990 [File System] 2016/02/05 09:13:20 DEBUG - Read content
2016-02-05_09:13:20.09990 ********************
2016-02-05_09:13:20.09990 {"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"redis_leader_z1","release":"","template":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e","templates":[{"name":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e"}]},"packages":{"redis-server":{"name":"redis-server","version":"b53d5357ab95a74c9489cd98a024e6ef6047aba0.1","sha1":"3ecef141a5da0f4889d37368f3bc6d844f76feb9","blobstore_id":"2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4"}},"configuration_hash":"32fa4ba645f6875371db43a43cb2029898276c92","networks":{"redis1":{"cloud_properties":{"name":"random"},"default":["dns","gateway"],"dns_record_name":"0.redis-leader-z1.redis1.redis-warden.bosh","ip":"10.244.2.2","netmask":"255.255.255.252"}},"resource_pool":{"cloud_properties":{"name":"random"},"name":"small_z1","stemcell":{"name":"bosh-warden-boshlite-ubuntu-trusty-go_agent","version":"3147"}},"deployment":"redis-warden","index":0,"id":"","persistent_disk":4096,"rendered_templates_archive":{"sha1":"0a0b6cba58fcb64720ac9ef094787983741fe95e","blobstore_id":"2dd5fd15-ccdc-446a-a248-981bfecef8ca"}}
2016-02-05_09:13:20.10044 ********************
2016-02-05_09:13:20.10081 [http-client] 2016/02/05 09:13:20 DEBUG - status function called
2016-02-05_09:13:20.10420 [http-client] 2016/02/05 09:13:20 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:13:20.11572 [attemptRetryStrategy] 2016/02/05 09:13:20 DEBUG - Making attempt #0
2016-02-05_09:13:20.11574 [clientRetryable] 2016/02/05 09:13:20 DEBUG - [requestID=f3c12d4e-7552-4b89-53c1-7fe3ff8e9f7c] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:13:20.11575 [clientRetryable] 2016/02/05 09:13:20 DEBUG - [requestID=f3c12d4e-7552-4b89-53c1-7fe3ff8e9f7c] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:13:20.11577 [monitJobSupervisor] 2016/02/05 09:13:20 DEBUG - Getting monit status
2016-02-05_09:13:20.11578 [http-client] 2016/02/05 09:13:20 DEBUG - status function called
2016-02-05_09:13:20.11579 [http-client] 2016/02/05 09:13:20 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:13:20.11581 [attemptRetryStrategy] 2016/02/05 09:13:20 DEBUG - Making attempt #0
2016-02-05_09:13:20.11581 [clientRetryable] 2016/02/05 09:13:20 DEBUG - [requestID=e171be7f-71f4-4de2-7919-143114da6dc8] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:13:20.11583 [clientRetryable] 2016/02/05 09:13:20 DEBUG - [requestID=e171be7f-71f4-4de2-7919-143114da6dc8] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:13:20.11804 [File System] 2016/02/05 09:13:20 DEBUG - Checking if file exists /var/vcap/monit/stopped
2016-02-05_09:13:20.11810 [File System] 2016/02/05 09:13:20 DEBUG - Reading file /var/vcap/bosh/log/ntpdate.out
2016-02-05_09:13:20.11815 [MBus Handler] 2016/02/05 09:13:20 INFO - Responding
2016-02-05_09:13:20.11818 [MBus Handler] 2016/02/05 09:13:20 DEBUG - Payload
2016-02-05_09:13:20.11821 ********************
2016-02-05_09:13:20.11824 {"value":{"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"redis_leader_z1","release":"","template":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e","templates":[{"name":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e"}]},"packages":{"redis-server":{"name":"redis-server","version":"b53d5357ab95a74c9489cd98a024e6ef6047aba0.1","sha1":"3ecef141a5da0f4889d37368f3bc6d844f76feb9","blobstore_id":"2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4"}},"configuration_hash":"32fa4ba645f6875371db43a43cb2029898276c92","networks":{"redis1":{"cloud_properties":{"name":"random"},"default":["dns","gateway"],"dns_record_name":"0.redis-leader-z1.redis1.redis-warden.bosh","ip":"10.244.2.2","netmask":"255.255.255.252"}},"resource_pool":{"cloud_properties":{"name":"random"},"name":"small_z1","stemcell":{"name":"bosh-warden-boshlite-ubuntu-trusty-go_agent","version":"3147"}},"deployment":"redis-warden","index":0,"id":"","persistent_disk":4096,"rendered_templates_archive":{"sha1":"0a0b6cba58fcb64720ac9ef094787983741fe95e","blobstore_id":"2dd5fd15-ccdc-446a-a248-981bfecef8ca"},"agent_id":"fa299ad9-f868-481b-a63e-c45522fafe8c","bosh_protocol":"1","job_state":"running","processes":[{"name":"redis","state":"running","uptime":{"secs":5},"mem":{"kb":8764,"percent":0.1},"cpu":{"total":0}}],"vm":{"name":"99860a26-62d3-464e-4659-41688467f8d2"},"ntp":{"message":"file missing"}}}
2016-02-05_09:13:20.11836 ********************
2016-02-05_09:13:55.33726 [agent] 2016/02/05 09:13:55 DEBUG - Building heartbeat
2016-02-05_09:13:55.33740 [File System] 2016/02/05 09:13:55 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2016-02-05_09:13:55.33741 [File System] 2016/02/05 09:13:55 DEBUG - Reading file /var/vcap/bosh/spec.json
2016-02-05_09:13:55.33752 [File System] 2016/02/05 09:13:55 DEBUG - Read content
2016-02-05_09:13:55.33753 ********************
2016-02-05_09:13:55.33754 {"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"redis_leader_z1","release":"","template":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e","templates":[{"name":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e"}]},"packages":{"redis-server":{"name":"redis-server","version":"b53d5357ab95a74c9489cd98a024e6ef6047aba0.1","sha1":"3ecef141a5da0f4889d37368f3bc6d844f76feb9","blobstore_id":"2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4"}},"configuration_hash":"32fa4ba645f6875371db43a43cb2029898276c92","networks":{"redis1":{"cloud_properties":{"name":"random"},"default":["dns","gateway"],"dns_record_name":"0.redis-leader-z1.redis1.redis-warden.bosh","ip":"10.244.2.2","netmask":"255.255.255.252"}},"resource_pool":{"cloud_properties":{"name":"random"},"name":"small_z1","stemcell":{"name":"bosh-warden-boshlite-ubuntu-trusty-go_agent","version":"3147"}},"deployment":"redis-warden","index":0,"id":"","persistent_disk":4096,"rendered_templates_archive":{"sha1":"0a0b6cba58fcb64720ac9ef094787983741fe95e","blobstore_id":"2dd5fd15-ccdc-446a-a248-981bfecef8ca"}}
2016-02-05_09:13:55.33768 ********************
2016-02-05_09:13:55.33769 [monitJobSupervisor] 2016/02/05 09:13:55 DEBUG - Getting monit status
2016-02-05_09:13:55.33770 [http-client] 2016/02/05 09:13:55 DEBUG - status function called
2016-02-05_09:13:55.33780 [http-client] 2016/02/05 09:13:55 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:13:55.33781 [attemptRetryStrategy] 2016/02/05 09:13:55 DEBUG - Making attempt #0
2016-02-05_09:13:55.33782 [clientRetryable] 2016/02/05 09:13:55 DEBUG - [requestID=6cfdbfd8-996b-4e69-7260-55ae097907b8] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:13:55.35516 [clientRetryable] 2016/02/05 09:13:55 DEBUG - [requestID=6cfdbfd8-996b-4e69-7260-55ae097907b8] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:13:55.36056 [File System] 2016/02/05 09:13:55 DEBUG - Checking if file exists /var/vcap/monit/stopped
2016-02-05_09:13:55.36058 [NATS Handler] 2016/02/05 09:13:55 INFO - Sending hm message 'heartbeat'
2016-02-05_09:13:55.36059 [NATS Handler] 2016/02/05 09:13:55 DEBUG - Message Payload
2016-02-05_09:13:55.36060 ********************
2016-02-05_09:13:55.36061 {"job":"redis_leader_z1","index":0,"job_state":"running","vitals":{"cpu":{"sys":"14.2","user":"12.5","wait":"0.2"},"disk":{"ephemeral":{"inode_percent":"20","percent":"47"},"persistent":{"inode_percent":"0","percent":"0"},"system":{"inode_percent":"20","percent":"47"}},"load":["2.21","1.69","1.40"],"mem":{"kb":"1886428","percent":"31"},"swap":{"kb":"136388","percent":"13"}},"node_id":""}
2016-02-05_09:13:55.36063 ********************
2016-02-05_09:14:31.86814 [agent] 2016/02/05 09:14:31 DEBUG - Ignored ssh event: %!(EXTRA string=error: Could not load host key: /etc/ssh/ssh_host_ecdsa_key)
2016-02-05_09:14:31.86824 [agent] 2016/02/05 09:14:31 DEBUG - Ignored ssh event: %!(EXTRA string=error: Could not load host key: /etc/ssh/ssh_host_ed25519_key)
2016-02-05_09:14:38.70790 [NATS Handler] 2016/02/05 09:14:38 INFO - Sending hm message 'alert'
2016-02-05_09:14:38.70793 [NATS Handler] 2016/02/05 09:14:38 DEBUG - Message Payload
2016-02-05_09:14:38.70794 ********************
2016-02-05_09:14:38.70795 {"id":"d2edce92-1530-4d52-6c1a-34725fa0a927","severity":4,"title":"SSH Login","summary":"Accepted password for vcap from 10.244.2.1 port 48318 ssh2","created_at":1454663678}
2016-02-05_09:14:38.70798 ********************
2016-02-05_09:14:38.71425 [agent] 2016/02/05 09:14:38 DEBUG - Ignored ssh event: %!(EXTRA string=pam_unix(sshd:session): session opened for user vcap by (uid=0))
2016-02-05_09:14:55.33263 [agent] 2016/02/05 09:14:55 DEBUG - Building heartbeat
2016-02-05_09:14:55.33445 [File System] 2016/02/05 09:14:55 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2016-02-05_09:14:55.33463 [File System] 2016/02/05 09:14:55 DEBUG - Reading file /var/vcap/bosh/spec.json
2016-02-05_09:14:55.33625 [File System] 2016/02/05 09:14:55 DEBUG - Read content
2016-02-05_09:14:55.33628 ********************
2016-02-05_09:14:55.33631 {"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"redis_leader_z1","release":"","template":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e","templates":[{"name":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e"}]},"packages":{"redis-server":{"name":"redis-server","version":"b53d5357ab95a74c9489cd98a024e6ef6047aba0.1","sha1":"3ecef141a5da0f4889d37368f3bc6d844f76feb9","blobstore_id":"2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4"}},"configuration_hash":"32fa4ba645f6875371db43a43cb2029898276c92","networks":{"redis1":{"cloud_properties":{"name":"random"},"default":["dns","gateway"],"dns_record_name":"0.redis-leader-z1.redis1.redis-warden.bosh","ip":"10.244.2.2","netmask":"255.255.255.252"}},"resource_pool":{"cloud_properties":{"name":"random"},"name":"small_z1","stemcell":{"name":"bosh-warden-boshlite-ubuntu-trusty-go_agent","version":"3147"}},"deployment":"redis-warden","index":0,"id":"","persistent_disk":4096,"rendered_templates_archive":{"sha1":"0a0b6cba58fcb64720ac9ef094787983741fe95e","blobstore_id":"2dd5fd15-ccdc-446a-a248-981bfecef8ca"}}
2016-02-05_09:14:55.33640 ********************
2016-02-05_09:14:55.33704 [monitJobSupervisor] 2016/02/05 09:14:55 DEBUG - Getting monit status
2016-02-05_09:14:55.33775 [http-client] 2016/02/05 09:14:55 DEBUG - status function called
2016-02-05_09:14:55.33776 [http-client] 2016/02/05 09:14:55 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:14:55.33847 [attemptRetryStrategy] 2016/02/05 09:14:55 DEBUG - Making attempt #0
2016-02-05_09:14:55.33848 [clientRetryable] 2016/02/05 09:14:55 DEBUG - [requestID=25e2f997-e914-4205-7af5-d7b07c6673d2] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:14:55.34066 [clientRetryable] 2016/02/05 09:14:55 DEBUG - [requestID=25e2f997-e914-4205-7af5-d7b07c6673d2] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:14:55.34108 [File System] 2016/02/05 09:14:55 DEBUG - Checking if file exists /var/vcap/monit/stopped
2016-02-05_09:14:55.34196 [NATS Handler] 2016/02/05 09:14:55 INFO - Sending hm message 'heartbeat'
2016-02-05_09:14:55.34197 [NATS Handler] 2016/02/05 09:14:55 DEBUG - Message Payload
2016-02-05_09:14:55.34200 ********************
2016-02-05_09:14:55.34200 {"job":"redis_leader_z1","index":0,"job_state":"running","vitals":{"cpu":{"sys":"13.7","user":"6.2","wait":"0.1"},"disk":{"ephemeral":{"inode_percent":"20","percent":"47"},"persistent":{"inode_percent":"0","percent":"0"},"system":{"inode_percent":"20","percent":"47"}},"load":["2.13","1.74","1.43"],"mem":{"kb":"1891988","percent":"31"},"swap":{"kb":"136388","percent":"13"}},"node_id":""}
2016-02-05_09:14:55.34202 ********************
2016-02-05_09:15:06.05431 [agent] 2016/02/05 09:15:06 DEBUG - Ignored ssh event: %!(EXTRA string=pam_unix(sudo:auth): authentication failure; logname=vcap uid=1000 euid=0 tty=/dev/pts/0 ruser=vcap rhost=  user=vcap)
2016-02-05_09:15:10.30830 [agent] 2016/02/05 09:15:10 DEBUG - Ignored ssh event: %!(EXTRA string=pam_unix(sudo:auth): conversation failed)
2016-02-05_09:15:10.30846 [agent] 2016/02/05 09:15:10 DEBUG - Ignored ssh event: %!(EXTRA string=pam_unix(sudo:auth): auth could not identify password for [vcap])
2016-02-05_09:15:10.30933 [agent] 2016/02/05 09:15:10 DEBUG - Ignored ssh event: %!(EXTRA string=vcap : 1 incorrect password attempt ; TTY=pts/0 ; PWD=/home/vcap ; USER=root ; COMMAND=/usr/bin/tail -f -n 200 /var/vcap/bosh/log/current)
2016-02-05_09:15:15.57380 [agent] 2016/02/05 09:15:15 DEBUG - Ignored ssh event: %!(EXTRA string=pam_unix(sudo:auth): authentication failure; logname=vcap uid=1000 euid=0 tty=/dev/pts/0 ruser=vcap rhost=  user=vcap)
2016-02-05_09:15:22.60187 [agent] 2016/02/05 09:15:22 DEBUG - Ignored ssh event: %!(EXTRA string=vcap : TTY=pts/0 ; PWD=/home/vcap ; USER=root ; COMMAND=/bin/su -)
2016-02-05_09:15:22.60266 [agent] 2016/02/05 09:15:22 DEBUG - Ignored ssh event: %!(EXTRA string=pam_unix(sudo:session): session opened for user root by vcap(uid=0))
2016-02-05_09:15:22.60999 [agent] 2016/02/05 09:15:22 DEBUG - Ignored ssh event: %!(EXTRA string=Successful su for root by root)
2016-02-05_09:15:22.61002 [agent] 2016/02/05 09:15:22 DEBUG - Ignored ssh event: %!(EXTRA string=+ /dev/pts/0 root:root)
2016-02-05_09:15:22.61053 [agent] 2016/02/05 09:15:22 DEBUG - Ignored ssh event: %!(EXTRA string=pam_unix(su:session): session opened for user root by vcap(uid=0))
2016-02-05_09:15:24.93199 [agent] 2016/02/05 09:15:24 DEBUG - Ignored ssh event: %!(EXTRA string=root : TTY=pts/0 ; PWD=/root ; USER=root ; COMMAND=/usr/bin/tail -f -n 200 /var/vcap/bosh/log/current)
2016-02-05_09:15:24.93247 [agent] 2016/02/05 09:15:24 DEBUG - Ignored ssh event: %!(EXTRA string=pam_unix(sudo:session): session opened for user root by vcap(uid=0))
2016-02-05_09:15:55.35535 [agent] 2016/02/05 09:15:55 DEBUG - Building heartbeat
2016-02-05_09:15:55.35641 [File System] 2016/02/05 09:15:55 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2016-02-05_09:15:55.35643 [File System] 2016/02/05 09:15:55 DEBUG - Reading file /var/vcap/bosh/spec.json
2016-02-05_09:15:55.35645 [File System] 2016/02/05 09:15:55 DEBUG - Read content
2016-02-05_09:15:55.35646 ********************
2016-02-05_09:15:55.35646 {"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"redis_leader_z1","release":"","template":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e","templates":[{"name":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e"}]},"packages":{"redis-server":{"name":"redis-server","version":"b53d5357ab95a74c9489cd98a024e6ef6047aba0.1","sha1":"3ecef141a5da0f4889d37368f3bc6d844f76feb9","blobstore_id":"2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4"}},"configuration_hash":"32fa4ba645f6875371db43a43cb2029898276c92","networks":{"redis1":{"cloud_properties":{"name":"random"},"default":["dns","gateway"],"dns_record_name":"0.redis-leader-z1.redis1.redis-warden.bosh","ip":"10.244.2.2","netmask":"255.255.255.252"}},"resource_pool":{"cloud_properties":{"name":"random"},"name":"small_z1","stemcell":{"name":"bosh-warden-boshlite-ubuntu-trusty-go_agent","version":"3147"}},"deployment":"redis-warden","index":0,"id":"","persistent_disk":4096,"rendered_templates_archive":{"sha1":"0a0b6cba58fcb64720ac9ef094787983741fe95e","blobstore_id":"2dd5fd15-ccdc-446a-a248-981bfecef8ca"}}
2016-02-05_09:15:55.35649 ********************
2016-02-05_09:15:55.35659 [monitJobSupervisor] 2016/02/05 09:15:55 DEBUG - Getting monit status
2016-02-05_09:15:55.35661 [http-client] 2016/02/05 09:15:55 DEBUG - status function called
2016-02-05_09:15:55.35664 [http-client] 2016/02/05 09:15:55 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:15:55.35665 [attemptRetryStrategy] 2016/02/05 09:15:55 DEBUG - Making attempt #0
2016-02-05_09:15:55.35668 [clientRetryable] 2016/02/05 09:15:55 DEBUG - [requestID=18e8bb01-94ba-4568-6baa-6fead036a83f] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:15:55.36115 [clientRetryable] 2016/02/05 09:15:55 DEBUG - [requestID=18e8bb01-94ba-4568-6baa-6fead036a83f] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:15:55.36268 [File System] 2016/02/05 09:15:55 DEBUG - Checking if file exists /var/vcap/monit/stopped
2016-02-05_09:15:55.36412 [NATS Handler] 2016/02/05 09:15:55 INFO - Sending hm message 'heartbeat'
2016-02-05_09:15:55.36513 [NATS Handler] 2016/02/05 09:15:55 DEBUG - Message Payload
2016-02-05_09:15:55.36514 ********************
2016-02-05_09:15:55.36515 {"job":"redis_leader_z1","index":0,"job_state":"running","vitals":{"cpu":{"sys":"12.9","user":"5.4","wait":"0.2"},"disk":{"ephemeral":{"inode_percent":"20","percent":"47"},"persistent":{"inode_percent":"0","percent":"0"},"system":{"inode_percent":"20","percent":"47"}},"load":["1.55","1.65","1.42"],"mem":{"kb":"1897056","percent":"31"},"swap":{"kb":"136388","percent":"13"}},"node_id":""}
2016-02-05_09:15:55.36524 ********************
2016-02-05_09:16:55.33271 [agent] 2016/02/05 09:16:55 DEBUG - Building heartbeat
2016-02-05_09:16:55.33676 [File System] 2016/02/05 09:16:55 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2016-02-05_09:16:55.33818 [File System] 2016/02/05 09:16:55 DEBUG - Reading file /var/vcap/bosh/spec.json
2016-02-05_09:16:55.33871 [File System] 2016/02/05 09:16:55 DEBUG - Read content
2016-02-05_09:16:55.33873 ********************
2016-02-05_09:16:55.33874 {"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"redis_leader_z1","release":"","template":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e","templates":[{"name":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e"}]},"packages":{"redis-server":{"name":"redis-server","version":"b53d5357ab95a74c9489cd98a024e6ef6047aba0.1","sha1":"3ecef141a5da0f4889d37368f3bc6d844f76feb9","blobstore_id":"2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4"}},"configuration_hash":"32fa4ba645f6875371db43a43cb2029898276c92","networks":{"redis1":{"cloud_properties":{"name":"random"},"default":["dns","gateway"],"dns_record_name":"0.redis-leader-z1.redis1.redis-warden.bosh","ip":"10.244.2.2","netmask":"255.255.255.252"}},"resource_pool":{"cloud_properties":{"name":"random"},"name":"small_z1","stemcell":{"name":"bosh-warden-boshlite-ubuntu-trusty-go_agent","version":"3147"}},"deployment":"redis-warden","index":0,"id":"","persistent_disk":4096,"rendered_templates_archive":{"sha1":"0a0b6cba58fcb64720ac9ef094787983741fe95e","blobstore_id":"2dd5fd15-ccdc-446a-a248-981bfecef8ca"}}
2016-02-05_09:16:55.33878 ********************
2016-02-05_09:16:55.33992 [monitJobSupervisor] 2016/02/05 09:16:55 DEBUG - Getting monit status
2016-02-05_09:16:55.34037 [http-client] 2016/02/05 09:16:55 DEBUG - status function called
2016-02-05_09:16:55.34060 [http-client] 2016/02/05 09:16:55 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:16:55.34110 [attemptRetryStrategy] 2016/02/05 09:16:55 DEBUG - Making attempt #0
2016-02-05_09:16:55.34133 [clientRetryable] 2016/02/05 09:16:55 DEBUG - [requestID=e5e59069-6db0-4673-6b5c-d1753a958cda] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:16:55.35039 [clientRetryable] 2016/02/05 09:16:55 DEBUG - [requestID=e5e59069-6db0-4673-6b5c-d1753a958cda] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:16:55.35114 [File System] 2016/02/05 09:16:55 DEBUG - Checking if file exists /var/vcap/monit/stopped
2016-02-05_09:16:55.35125 [NATS Handler] 2016/02/05 09:16:55 INFO - Sending hm message 'heartbeat'
2016-02-05_09:16:55.35131 [NATS Handler] 2016/02/05 09:16:55 DEBUG - Message Payload
2016-02-05_09:16:55.35133 ********************
2016-02-05_09:16:55.35134 {"job":"redis_leader_z1","index":0,"job_state":"running","vitals":{"cpu":{"sys":"12.8","user":"5.7","wait":"0.2"},"disk":{"ephemeral":{"inode_percent":"20","percent":"47"},"persistent":{"inode_percent":"0","percent":"0"},"system":{"inode_percent":"20","percent":"47"}},"load":["1.54","1.62","1.42"],"mem":{"kb":"1895004","percent":"31"},"swap":{"kb":"136388","percent":"13"}},"node_id":""}
2016-02-05_09:16:55.35135 ********************
2016-02-05_09:17:55.33869 [agent] 2016/02/05 09:17:55 DEBUG - Building heartbeat
2016-02-05_09:17:55.34277 [File System] 2016/02/05 09:17:55 DEBUG - Checking if file exists /var/vcap/bosh/spec.json
2016-02-05_09:17:55.34280 [File System] 2016/02/05 09:17:55 DEBUG - Reading file /var/vcap/bosh/spec.json
2016-02-05_09:17:55.34282 [File System] 2016/02/05 09:17:55 DEBUG - Read content
2016-02-05_09:17:55.34282 ********************
2016-02-05_09:17:55.34282 {"properties":{"logging":{"max_log_file_size":""}},"job":{"name":"redis_leader_z1","release":"","template":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e","templates":[{"name":"redis","version":"1c917ee3b8d3639bf788777602fa4dc74647dcdf","sha1":"d6cc4965cbd7ba2eb8ca1932205b4d6bd16ef168","blobstore_id":"dceaec35-cf06-43b5-92ff-9d8982d5a82e"}]},"packages":{"redis-server":{"name":"redis-server","version":"b53d5357ab95a74c9489cd98a024e6ef6047aba0.1","sha1":"3ecef141a5da0f4889d37368f3bc6d844f76feb9","blobstore_id":"2b9a26df-91d7-46cb-7adc-7a2e13e9a7d4"}},"configuration_hash":"32fa4ba645f6875371db43a43cb2029898276c92","networks":{"redis1":{"cloud_properties":{"name":"random"},"default":["dns","gateway"],"dns_record_name":"0.redis-leader-z1.redis1.redis-warden.bosh","ip":"10.244.2.2","netmask":"255.255.255.252"}},"resource_pool":{"cloud_properties":{"name":"random"},"name":"small_z1","stemcell":{"name":"bosh-warden-boshlite-ubuntu-trusty-go_agent","version":"3147"}},"deployment":"redis-warden","index":0,"id":"","persistent_disk":4096,"rendered_templates_archive":{"sha1":"0a0b6cba58fcb64720ac9ef094787983741fe95e","blobstore_id":"2dd5fd15-ccdc-446a-a248-981bfecef8ca"}}
2016-02-05_09:17:55.34286 ********************
2016-02-05_09:17:55.34344 [monitJobSupervisor] 2016/02/05 09:17:55 DEBUG - Getting monit status
2016-02-05_09:17:55.34388 [http-client] 2016/02/05 09:17:55 DEBUG - status function called
2016-02-05_09:17:55.34475 [http-client] 2016/02/05 09:17:55 DEBUG - Monit request: url='http://127.0.0.1:2822/_status2?format=xml' body=''
2016-02-05_09:17:55.34518 [attemptRetryStrategy] 2016/02/05 09:17:55 DEBUG - Making attempt #0
2016-02-05_09:17:55.35783 [clientRetryable] 2016/02/05 09:17:55 DEBUG - [requestID=6b72bd78-4426-48e8-527c-abbf9305e685] Requesting (attempt=1): Request{ Method: 'GET', URL: 'http://127.0.0.1:2822/_status2?format=xml' }
2016-02-05_09:17:55.36916 [clientRetryable] 2016/02/05 09:17:55 DEBUG - [requestID=6b72bd78-4426-48e8-527c-abbf9305e685] Request succeeded (attempts=1), response: Response{ StatusCode: 200, Status: '200 OK' }
2016-02-05_09:17:55.36968 [File System] 2016/02/05 09:17:55 DEBUG - Checking if file exists /var/vcap/monit/stopped
2016-02-05_09:17:55.36970 [NATS Handler] 2016/02/05 09:17:55 INFO - Sending hm message 'heartbeat'
2016-02-05_09:17:55.36971 [NATS Handler] 2016/02/05 09:17:55 DEBUG - Message Payload
2016-02-05_09:17:55.36972 ********************
2016-02-05_09:17:55.36974 {"job":"redis_leader_z1","index":0,"job_state":"running","vitals":{"cpu":{"sys":"13.9","user":"5.4","wait":"0.2"},"disk":{"ephemeral":{"inode_percent":"20","percent":"47"},"persistent":{"inode_percent":"0","percent":"0"},"system":{"inode_percent":"20","percent":"47"}},"load":["1.19","1.50","1.39"],"mem":{"kb":"1894848","percent":"31"},"swap":{"kb":"136388","percent":"13"}},"node_id":""}
2016-02-05_09:17:55.36975 ********************
dpb587-pivotal commented 7 years ago

Closing due to inactivity - you might want to try using the default health monitor intervals. Feel free to reopen with more details and logs if you're still having problems after trying that.