Closed vongruenigen closed 5 months ago
After some more digging I saw that the problem didn't first appear in a 5.x release, but rather in release 4.4.7. We didn't notice until now because we had version 4.4.5 pinned in our npm package.
Hey @vongruenigen, good catch. I have noted and registered your CLA signature. The PR looks good; I will do some investigation here before getting the approval and merging it.
Thanks for catching the regression and pushing the changes.
Thanks a lot for the feedback and suggestions @bigmontz! I'll be on holiday for a week from today on but I'll try to find some spare time to incorporate your suggestions above. Will ping you again once I'm done.
@bigmontz thanks a lot for the swift review and merging. Really looking forward to the new release! 👍
Background
Lately, I've been upgrading a big business application that was using the latest 4.x version of the neo4j driver to use the latest 5.x version (we still use neo4j 4.x, but they are compatible according to docs). The upgrade itself was very smooth, but while testing everything afterwards, we noticed that (almost) all of our requests to the backend took considerably longer to finish (~2x).
After doing some investigation (mainly by using clinic flamegraphs) I noticed that there was a considerable increase in the time spent parsing the raw neo4j responses in the driver. Looking at it in more detail revealed that most of the increase stems from one particular codepath, namely from calls to getTimeInZoneId.
Looking at it almost immediately revealed the culprit, which is how the Intl API is used there. It seems that a new Intl.DateTimeFormat object is created for each date time returned in the response. The Intl API is notoriously slow afaik, hence we should reduce the usage of those APIs to an absolute minimum in hot code paths, such as response parsing. Also, since the application I was upgrading is basically doing nothing else than managing timestamps at its core, it made sense that we noticed the performance degradation in such a severe way.
Changes in this MR
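As a rough illustration of the per-time-zone caching this MR is about, here is a minimal sketch. It is not the driver's actual code: formatterCache, getCachedFormatter and formatInZone are hypothetical names, and the locale and format options are assumptions chosen for the example.

```typescript
// Hypothetical sketch: keep one Intl.DateTimeFormat per time zone ID.
// Names, locale and options are illustrative assumptions, not the driver's code.
const formatterCache = new Map<string, Intl.DateTimeFormat>();

function getCachedFormatter(timeZoneId: string): Intl.DateTimeFormat {
  let formatter = formatterCache.get(timeZoneId);
  if (formatter === undefined) {
    // Constructing the formatter is the expensive step, so do it once per zone.
    formatter = new Intl.DateTimeFormat('en-US', {
      timeZone: timeZoneId,
      year: 'numeric',
      month: 'numeric',
      day: 'numeric',
      hour: 'numeric',
      minute: 'numeric',
      second: 'numeric',
      hour12: false,
    });
    formatterCache.set(timeZoneId, formatter);
  }
  return formatter;
}

// Hot path: reuse the cached formatter instead of building a new one per value.
function formatInZone(epochMillis: number, timeZoneId: string): string {
  return getCachedFormatter(timeZoneId).format(new Date(epochMillis));
}
```

With this shape, parsing many date times in the same zone constructs a single formatter, and only the comparatively cheap format call runs per value.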
I decided to try caching the DateTimeFormat to prevent initializing the formatter for a given time zone more than once, and it seems to have helped quite a lot (in our case the "big" requests got a speedup of 60-70%). I also checked for other usages of Intl in the code base, but luckily only found one other place, where it's used to check the validity of a given timezone string. I added caching there as well, though I'm not entirely sure if this is a case of premature optimization, since we personally didn't run into performance issues where this particular method was involved. I'll leave this up to you guys to decide if we should include those changes in this MR as well, or revert them.
Remarks/questions
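For the open question about the validity check, a cached version could look roughly like the sketch below. This is an assumption about the general shape, not the code in this MR: validityCache and isValidTimeZoneId are hypothetical names.

```typescript
// Hypothetical sketch: remember the validity verdict for each time zone string.
const validityCache = new Map<string, boolean>();

function isValidTimeZoneId(timeZoneId: string): boolean {
  const cached = validityCache.get(timeZoneId);
  if (cached !== undefined) {
    return cached;
  }
  let valid: boolean;
  try {
    // Intl.DateTimeFormat throws a RangeError for unknown time zone names.
    new Intl.DateTimeFormat(undefined, { timeZone: timeZoneId });
    valid = true;
  } catch (_error) {
    valid = false;
  }
  validityCache.set(timeZoneId, valid);
  return valid;
}
```

One trade-off to keep in mind: the map also stores invalid strings, so it can grow if arbitrary user input is checked, whereas the set of valid zone IDs is small.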
The getTimeInZoneId method is extensively tested with the test suite already. Is this fine for you?
Maybe there's a better way to achieve the same, and since I'm in no way an expert on the Intl API or this driver, I'm very open to suggestions and other ideas on how to solve this (also because I'm a first-time contributor, see below).
Looking forward to hearing from you guys, hopefully we can get this resolved as quickly as possible.
Thanks in advance! :)
CLA
I'm a first time contributor and the issue template said that I need to mention it here somewhere. I signed the CLA, let me know in case anything else is needed from my side.