Actual state save timeout in Listener

JimmyMa commented 8 years ago

The Listener log shows the error below almost every 20 seconds:

{"timestamp":1463115787.966745853,"process_id":23237,"source":"vcap.hm9000.listener","log_level":"info","message":"Saving Heartbeats - {\"Heartbeats to Save\":\"46\"}","data":null}
{"timestamp":1463116020.814734697,"process_id":23237,"source":"vcap.hm9000.listener","log_level":"info","message":"Save took too long. Not bumping freshness.","data":null}

The cause is that the time spent in ensureCacheIsReady (which runs on a 20-second interval) is counted toward the duration of SyncHeartbeats, and that pushes the actual state save past its timeout.

https://github.com/cloudfoundry/hm9000/blob/master/store/actual_state.go#L42

cf-gitbot commented 8 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/119571999

The labels on this github issue will be updated when the story is started.

fraenkel commented 8 years ago

We have made improvements to use etcd less, which should improve the overall etcd usage. The code has not yet been delivered to CF release but will be shortly.
