movio / bramble

A federated GraphQL API gateway
https://movio.github.io/bramble/
MIT License

Remove service polling client keep alive #144

Closed lucianjon closed 2 years ago

lucianjon commented 2 years ago

Problem

We've noticed errors such as `connection reset by peer` or `EOF` when bramble polls certain services, most notably anything built on Node.js. After some investigation, it turns out these servers use a shorter Keep-Alive timeout than bramble's HTTP client. The server can therefore close an idle connection at the same moment the client tries to reuse it for the next poll, which is a race condition.

Solution

Our first attempt was to add jitter to the polling period so that polls no longer land at a fixed interval. While this helped, we still ran into the errors above.

The most bulletproof way to avoid the race seems to be making the client's Keep-Alive timeout shorter than the server's. Since service polling is not a high-throughput task, we decided it is simpler to disable Keep-Alive entirely for the polling client. The HTTP client bramble uses to run queries against downstream services is untouched.