cloudfoundry-community / firehose-to-syslog

Send firehose events from Cloud Foundry to syslog.
MIT License
44 stars 58 forks source link

Refactor caching #192

Closed aeijdenberg closed 6 years ago

aeijdenberg commented 6 years ago

To be read in conjunction with a longer comment I'm about to leave in #181

aeijdenberg commented 6 years ago

We've been running this for a few weeks now without issue and have seen a big drop in CPU usage on our API servers.

Summary of changes:

  1. Load all apps before starting to read from nozzle. In this manner we won't fill our buffer by falling behind before we have apps populated.
  2. Instead of fetching all apps on scheduled basic, instead lazily re-fetch each app or space or org based on a randomized TTL.
  3. Allow for in-memory cache instead of bolt if not persisting the database (which by default bosh release is not, as the default path is not in a persistent location).
  4. Add options to strip app name suffixes. This is useful for blue/green and zero downtime pushes where apps typically have temporary suffixes during renames. Since this data is often incorrect anyway, we may as well strip it.
  5. Stop using cf-client for important API calls, as it seems to often want to get smart and make successive calls for extra info that we don't need, contributing to latency and CPU load on API servers.

I think that's mostly it. We could also considering bumping the default TTL for the cached data from 60 seconds to something much larger, as the only effect of this being cached is if an app/space/org is renamed, which I imagine is a fairly rare event.

shinji62 commented 6 years ago

Seems Test are failing

# github.com/cloudfoundry-community/firehose-to-syslog/caching_test
caching/caching_test.go:122:13: undefined: CachingBoltConfig
caching/caching_test.go:130:11: undefined: CachingBolt
caching/caching_test.go:137:17: undefined: NewCachingBolt
caching/caching_test.go:217:19: undefined: NewCachingBolt
caching/caching_test.go:229:19: undefined: NewCachingBolt
caching/caching_test.go:242:18: undefined: NewCachingBolt

Can fix that ? Thanks

aeijdenberg commented 6 years ago

Hi @shinji62 - yes, I'll work on the tests - apologies I was expecting a bit more pushback / discussion before merging. :)

While I think the code is reasonably robust, I hadn't gone to the effort to update tests yet and wanted to feedback that the approach was acceptable first.

shinji62 commented 6 years ago

ahaha ! Too much kindness in me ....

Approach is ok and I think is important to add fresh blood to the repos.

Once test pass I will check quickly again.

As no other dev are pending I will let into the dev. branch and reverse if needed.

NO STRESS

jmcarp commented 6 years ago

@aeijdenberg @shinji62: this looks great! Do you think this is stable enough to cut a new release, or does it need more testing/validation first?

shinji62 commented 6 years ago

Miss some testing and we should be good.

aeijdenberg commented 6 years ago

Hi @jmcarp - yes, I think it's fairly stable, but agree that at minimum the tests should pass (and for extra credit also test something useful).

I will try to find time early next week to do so.