Nordstrom / xrpc

Simple, production ready Java API server built on top of functional composition.
Apache License 2.0
17 stars 17 forks source link

Fix RendezvousHash concurrent NPE #208

Closed wboyle closed 5 years ago

wboyle commented 5 years ago

Resolves #203 -- many thanks to @manimaul for the report!

Context

~6% of all requests are currently failing to resolve due to a concurrency NPE in ServerRateLimiter's RendezvousHash. The NPE occurs whenever multiple methods reset and populate a RendezvousHash's internal hashmap at the same.

This MR resolves the issue by no longer using a shared map and instead creating new instances on each call. Also includes some small refactors for clarity and a new testcase to help prevent regressions. The new test fails consistently with the original RendezvousHash code.

You can see from the performance regression script that current master fails with socket errors in 1912 / 32720 requests while this branch fails 0 / 76753.

Regression on master:

screen shot 2018-11-29 at 6 52 26 pm

Regression with branch:

screen shot 2018-11-29 at 6 49 40 pm
jkinkead commented 5 years ago

Approving; feel free to update the method or not, as you see fit.