honeycombio / refinery

Refinery is a trace-aware tail-based sampling proxy. It examines whole traces and intelligently applies sampling decisions (whether to keep or discard) to each trace.

Make it more obvious when Refinery can't communicate to Honeycomb #812

Open cartermp opened 1 year ago

cartermp commented 1 year ago

For example, let's say the Honeycomb API is completely unreachable (like in a recent outage we had). Refinery could more prominently highlight, "Hey, I'm getting 500s from my upstream, and it's not you - it's Honeycomb".

Today, you can set up Refinery metrics to tell you this. But not everyone running Refinery will do that, and those who do won't necessarily be watching those metrics.

Is there some other notification or...something that could be done that's a bit more obvious for this use case?

kentquirk commented 1 year ago

One possibility would be to create backpressure: if Refinery goes a certain time without being able to send anything upstream, then instead of continuing to accept data it could start returning, say, a 503 Service Unavailable. The body of the response could include an explanation that this is due to an upstream problem and a link to the Honeycomb status page.

Does that sound plausible?

cartermp commented 1 year ago

Mmm, that could probably work I think.

kentquirk commented 11 months ago

This is going on hold until this resolves since it's going to make some modifications to Husky, and this work will want to use those.

VinozzZ commented 10 months ago

This Husky PR should unblock this issue.

kentquirk commented 10 months ago

We don't have time to fix and reliably test this in 2.3, so I'm moving it to 2.4.