launchdarkly / node-server-sdk

LaunchDarkly Server-side SDK for Node
Other
79 stars 65 forks source link

Improve performance of feature_store.get #150

Closed mdgbayly closed 5 years ago

mdgbayly commented 5 years ago

We recently upgraded ldclient-node from 3.0.15 to 5.7.3 and we noticed a significant degradation in the performance of our application. We tracked this down to the changes made to the feature-store to support asynchronous APIs and the use of setTimeout specifically.

process.nextTick appears to perform much faster than setTimeout and marginally faster than setImmediate in environments we've tested in. (Docker running on 64bit Amazon Linux/2.12.10 with NodeJS 8.11.3)

This nodejs event loop blog post describes when to use setImmediate vs process.nextTick. Although it recommends setImmediate for most cases, in this case, process.nextTick feels more appropriate. https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/#process-nexttick-vs-setimmediate

The relative performance of process.nextTick, setImmediate and setTimeout can be dependent on the hardware, OS, nodeJS version according to this benchmark post. In our testing with NodeJS 8.11.3 on AWS Linux, nextTick was noticeably faster than setImmediate and orders of magnitude faster than setTimeout.

https://gist.github.com/mmalecki/1257394

eli-darkly commented 5 years ago

Have you retested with this change to verify that it undoes the performance degradation you were seeing?

It's true that nextTick would be faster than setTimeout, but I'm actually not convinced that we even need to be using either one. I'm trying to reconstruct what our thinking was at the time, but I think this may have been an over-eager change; there was an earlier issue of not deferring callbacks to application code, but that's a different matter.

mdgbayly commented 5 years ago

Yes, I did retest. We run a fairly extensive performance test against a suite of APIs which all check a number of feature flags and then we calculate a global apdex score across those APIs. With the nextTick fix in place our apdex score went up by 10%. Incidentally, though we still feel like using Launch Darkly feature flags is for some reason adding unexpected overhead. I ran the same test with all our feature flag checks hardcoded to return true (effectively bypassing the ldclient-node lib completely) and our apdex rose by another 10%. I have not tried running the same test yet on v3.0.15 which we were using earlier.

I have not looked at the code in detail, but my assumption was that this async callback calling was introduced around the same time that the redis feature store was introduced as possibly the redis feature store would be using an async implementation and so this was needed to keep the interface consistent?

eli-darkly commented 5 years ago

my assumption was that this async callback calling was introduced around the same time that the redis feature store was introduced as possibly the redis feature store would be using an async implementation and so this was needed to keep the interface consistent

It's true that the feature store has to have callback semantics because it might be doing I/O, but it's been that way for a long time - that is not a change between v3.0.15 and v5.7.3. The difference is that in the in-memory implementation of the feature store, it used to just call the callback directly; later, that was changed to defer the callback instead. As I said, I am not actually sure that change was necessary, so I will look into that. But in the meantime I think you are right that nextTick would be better.

As for what else might be affecting performance: I can't comment on the specific figures you gave, since I don't know anything about how your application uses feature flags (like, how many flag evaluations you are doing for any given request). However, there have been a couple of changes since 3.0.15 that could have added some amount of overhead:

Off the top of my head I can't think of any others.

eli-darkly commented 5 years ago

Thanks again for the PR and for looking into this. I'm going to close this PR only because we're including some changes in the next Node SDK release (next week) that are for the same purpose, but go a bit further. We'll credit you with the fix in any case.

eli-darkly commented 4 years ago

This and other similar fixes were included in the 5.9.2 release.