uber / tchannel

network multiplexing and framing protocol for RPC
MIT License
1.15k stars 129 forks source link

Larch reservoir sampled logger #1292

Closed rf closed 8 years ago

rf commented 8 years ago

@Raynos @jcorbin @kriskowal

To continue the discussion on the commit:

  1. Should we do straight reservoir sampling or do a scored-record sampling approach for the first impl? For v0 we'll do straight reservoir sampling just to see what affect it has on resource usage when generating a lot of error logs.
  2. Is violation of log ordering a problem? I think its not; the logs will be mis ordered per 50ms chunk and only if there are more than 500 logs in that chunk.
  3. My weird copy serialize optimization in record.js. It uses an object with the data that doesn't change and copies the meta fields into it. My benchmarks showed that to be faster, but I could remove the optimization for clarity and instead just copy all fields into a new object.
Raynos commented 8 years ago

@rf Let's do straight sampling for v0.

We want to verify that sampling makes error frames faster with our benchmarks.

I'm working on https://github.com/uber/tchannel/pull/1262

Raynos commented 8 years ago

I think we can keep the ordering semantics per worker somehow. But if we lose it is not a problem. Let's just get the ts correct on the LogRecord so we can re-order if needed.

Raynos commented 8 years ago

Lets get some tests in here too :)

Raynos commented 8 years ago

Let's add stats then ship.

Raynos commented 8 years ago

Fix lint.

Raynos commented 8 years ago

Let's also get a stat for everything we flush

In production I want to see the distribution of logs that are dropped and flushed by levelName.

Raynos commented 8 years ago

legit. please fix travis though.