grafana / tempo

Grafana Tempo is a high volume, minimal dependency distributed tracing backend.
https://grafana.com/oss/tempo/
GNU Affero General Public License v3.0

Generators: Drop out of the ring before stopping ingestion #4101

Closed joe-elliott closed 1 month ago

joe-elliott commented 1 month ago

What this PR does: Currently, on shutdown, the generators set a flag to stop receiving traffic and only then remove themselves from the ring. As a result, distributors keep sending for a few seconds to a generator that is already refusing traffic.

This PR:
1) Drops out of the ring first and only then sets the flag to stop ingestion.
2) Adds a sleep between these two steps.

By default Tempo uses memberlist for ring propagation, which often needs a few seconds to propagate state from one component to another. Without the sleep, reordering the steps alone was ineffective. I do not like the sleep, but I'm not sure there's a better solution.

To the left of the red line is what a generator rollout looks like today; to the right is what it looks like with this change. There are still a few drops, but the number is significantly reduced.

image

Checklist