Closed fatfisz closed 5 years ago
Sorry, I didn't notice that there was previously an issue created for that: https://github.com/sinonjs/lolex/issues/159. I guess there are now two people with inefficient tests.
I understand that without a benchmark it's hard to make speed comparisons, but even putting the performance aside, objects are not meant to be used as "hashmaps" instead of Map. This was always inefficient and that's why Maps (the actual hashmaps) were added and have been widely supported for some time now. At this point it's just a bad practice to use objects like this.
If there's really a need to support ES3, then I guess the only option for me is to fork. I'll still wait for your response, since I don't know if your requirements have changed in the past few months.
There is a diff I made that switches to a priority queue; it mostly worked, but it was too risky to enable by default.
Lolex has millions of downloads a month and I'm very hesitant to push for this sort of change.
If you make a PR that keeps the existing implementation while turning on a queue-based one behind an option/flag - I promise to review/add it as long as it doesn't break the existing API.
This was always inefficient and that's why Maps (the actual hashmaps) were added and have been widely supported for some time now.
Just a historical note: Maps were added because objects don't support non-string (or symbol) keys. Map objects can have arbitrary object keys.
Objects used as maps are still as fast as Map for hashmap-style access (at least in Node/V8).
Objects used as maps are still as fast as Map for hashmap-style access (at least in Node/V8).
But my test has proven otherwise? Maybe my info is outdated, but I remember that extending objects or deleting properties resulted in internal classes being created, at least in V8. Since no property name is ever repeated in lolex (property names are unique ids), a lot of stuff is going on under the hood. Or is that not the case now?
But my test has proven otherwise?
This is all OT - but generally V8 represents objects in two ways - "hidden classes" or hash maps. It calls them "slow mode" and "fast mode" and switches between them based on how the object behaves. I go over it in here (it's a slightly old answer but captures the gist of things).
When an object is very "dynamic" it behaves the same way a Map does.
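To illustrate the access pattern in question (with the caveat that V8 internals are implementation details and change between versions): lolex gives every timer a brand-new key and deletes it when the timer fires or is cleared, which is exactly the kind of churn that tends to push an object into dictionary ("slow") mode. A rough, hypothetical illustration:

```js
// Rough illustration of the access pattern under discussion (unique keys,
// frequent deletes), which in V8 typically forces the object into
// dictionary ("slow") mode. Exact thresholds are V8 implementation details.
const timers = {};
let id = 0;

function addTimer(fn, callAt) {
  timers[++id] = { func: fn, callAt }; // every key is brand new
  return id;
}

function clearTimer(timerId) {
  delete timers[timerId]; // deletes are what tend to trigger dictionary mode
}
```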
I implemented a Map version and also an Array version to have more of a comparison between different data structures. Below are the results of the stress testing and the code I used to perform the test. TL;DR: the Map and Array versions perform comparably, while the current version (called "default" in the tests) stands out as the slowest. All of the versions seem to reach roughly linear growth in test duration at some point, and from that we can deduce that the current version can be as much as 13 times slower than the alternatives in some scenarios. In the most basic scenario of a recurring timeout the current version is ~3 times slower.
I'd like to note that I didn't put much effort into implementing the Map and Array versions, with the exception of a "clever" fast removal of an item from the array, so they could likely be improved further.
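For illustration, a stress test along these lines can be written against lolex's public createClock/tick API. This is only a hedged sketch of the kind of test being described (a recurring timeout advanced through many simulated seconds), not the exact benchmark, and absolute numbers will differ per machine and per implementation:

```js
// Hedged sketch of a stress test: schedule a recurring timeout and advance a
// lolex clock through a large number of simulated seconds, timing how long
// the tick takes. Not the original benchmark code.
const lolex = require("lolex");

function stress(totalSeconds) {
  const clock = lolex.createClock();
  let fired = 0;

  (function reschedule() {
    clock.setTimeout(() => {
      fired++;
      reschedule(); // recurring timeout: one pending timer at a time
    }, 1000);
  })();

  const start = process.hrtime.bigint();
  clock.tick(totalSeconds * 1000);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`fired ${fired} timers in ${elapsedMs.toFixed(1)} ms`);
}

stress(1e5);
```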
I'd also like to note that the slowdown from my original post was amplified by the slower machine I use for work. Not everyone can use top-notch equipment, and those are the cases where performance really matters.
So I've ended up preparing two new alternatives that perform comparably, with the Map version being better in more extreme scenarios. OTOH the benefit of the Array version is that old browsers will support it. What do you suggest I do at this point? Having three different versions is overkill, and I added more than one just for testing purposes in the first place. Once you've made a decision I'll prepare a PR.
@fatfisz first of all - thanks for this!
Personally I'd prefer a priority queue version alongside the current implementation, with the ability to switch between them.
A map/array is nice, but switching to a priority queue makes sense to me and really helped our perf in Node :) Basically I suspect the majority use case is executing timers one by one, in order, up to a certain point, for which a priority queue is ideal since timers are not inserted in order.
Note that your benchmark only inserts timers "in order".
Ok, I'll do a priority queue version too. I didn't want to invest too much time into this and went for a quick win instead, but I have no problem with working on it more.
Btw. all versions are implemented next to each other and activated by flags (useMap, useArray). I'll just add a priority queue too. Do you have a specific data structure in mind that I should use as a priority queue, or should I come up with something?
The most problematic thing in here is the firstTimerInRange function, since if we want a fast lookup it would require a tree structure. But that's only if we don't change the internals a bit... It looks to me that the only reason for the range lookup is the setSystemTime method, which makes it possible to skip some timers and jump straight into the future without firing them. Calling clock.tick then won't call timers from the past. However, this could also be achieved by moving past timers into another queue when setSystemTime is called. Then clock.tick would just use the "future" queue, and clock.next would use the "past" queue first and then the "future" queue. I think that by doing that, clock.tick would become a little bit simpler, since it wouldn't need to do the time compensation at the beginning; setSystemTime would do it instead. This would also allow using a heap instead of a tree, which is considerably simpler to implement, and so faster and less error-prone (a rough sketch of this two-queue idea is below).
This all hinges on an assumption that setSystemTime will be called much less frequently than clock.tick or clock.next, since setSystemTime will need to go through all of the timers, and although it is already doing that in advanceTimers, moving timers between the two queues might be a bit more time consuming.
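To make the two-queue idea concrete, here is a rough sketch with hypothetical names; a sorted array stands in for a real heap just to keep it short, and none of this is lolex's actual internals:

```js
// Hypothetical sketch of the two-queue idea, not lolex's actual internals.
// A sorted array stands in for a heap; what matters is the queue interface
// (push / peek earliest / pop earliest), which a binary heap would provide
// with O(log n) inserts.
function makeQueue() {
  const items = [];
  return {
    push(timer) {
      let i = items.length;
      while (i > 0 && items[i - 1].callAt > timer.callAt) i--;
      items.splice(i, 0, timer); // keep ordered by callAt, earliest first
    },
    peek: () => items[0],
    pop: () => items.shift(),
  };
}

const future = makeQueue(); // timers that are still due
const past = makeQueue();   // timers skipped over by setSystemTime

function setSystemTime(clock, newNow) {
  // Instead of firing them, move timers that are now "in the past" aside,
  // so tick() no longer needs a lower bound when looking for the next timer.
  while (future.peek() && future.peek().callAt <= newNow) {
    past.push(future.pop());
  }
  clock.now = newNow;
}

function next(clock) {
  // next() prefers timers that were jumped over, then regular ones.
  const timer = past.peek() ? past.pop() : future.pop();
  if (!timer) return;
  clock.now = Math.max(clock.now, timer.callAt);
  timer.func.apply(null, timer.args || []);
}
```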
Do you have a specific data structure in mind that I should use as a priority queue or should I come up with something?
There are multiple versions on NPM but you can also use the one we use at Node.
The most problematic thing in here is the firstTimerInRange function, since if we want a fast lookup it would require a tree structure.
A priority queue (implemented as a heap) is actually quite good at that.
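For illustration only (a hedged sketch, not lolex's code): assuming past timers have been moved aside as proposed above, the earliest pending timer is always at the root of a callAt-ordered heap, so the range lookup reduces to a single peek instead of a scan over all timers.

```js
// Hedged sketch: `heap` is any priority queue ordered by callAt with a peek()
// method. Assumes timers earlier than `from` were already fired or moved aside.
function firstTimerInRange(heap, from, to) {
  const timer = heap.peek(); // earliest callAt, or undefined when empty
  if (!timer || timer.callAt > to) return null;  // nothing due within the range
  return timer.callAt >= from ? timer : null;    // callAt < from shouldn't occur under the assumption
}
```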
This all hinges on an assumption that setSystemTime will be called much less frequently than clock.tick or clock.next
I think that's a safe and valid assumption.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Sorry for the lack of communication. I actually made 2 attempts at working on that, but every time I gave up because the way Lolex is structured makes it very uncomfortable for me to work on it (1 big file with everything in it). Also there were some other problems that would need to be solved anyway.
So I'll just close this issue, since at this point it's stale, as the bot said.
Is there any chance we could get the (lolex) code used in the stress test above? A 10x change in lolex execution time would change my life, even if I have to go bug-hunting. Any help extremely appreciated. @fatfisz
Same answer as #159. Just have a look at a suitable priority queue and see if you can make it work. Ask questions that are not already answered by these two issues.
@fatso83 I actually found https://github.com/akhilkedia/lolex-queue/blob/master/fast-lolex.js and got it to work for my needs. The tick() method wasn't implemented (only next() was), along with a few other methods it depends on; I implemented those, and it's now over 10 times faster.
@arthurwolf I had a branch with fake-timers using a priority queue but https://github.com/sinonjs/fake-timers/issues/159#issuecomment-390140199
For the "it was never slow for me", it definitely was for me (used to run a trading bot in "backtesting" mode instead of it's normal "live" mode, so spending a lot of time within lolex). About we don't have benchmarks, any timing will show a big difference between the two methods. The code I forked from (lolex-fast.js) even provides a testing script in their repo.
What actually happens
As in the title, Lolex is very slow, at least much slower than it could be. For a particular test in my project the current Lolex implementation takes 124s to complete on my machine. After some profiling and diagnosing the problem, I managed to trim that down to 22s.
The culprit is clock.timers: it's an object, but it's not an appropriate data structure for the operations that Lolex performs on it. Profiling in Chrome showed that the slowest pieces of code in particular are the for...in loops with hasOwnProperty calls. Using Map for clock.timers resulted in a tremendous speed up.
How to reproduce
I have a piece of code that sets a new timeout a second after the function is called, something like this:
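A minimal stand-in for that kind of component might look like this (hypothetical names, not the original snippet):

```js
// Hypothetical stand-in for the component described: it renders, then
// re-schedules itself one second after each call.
function startDisplayClock(render) {
  let elapsed = 0;
  (function step() {
    render(elapsed);
    elapsed += 1;
    setTimeout(step, 1000); // schedules the next update a second later
  })();
}
```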
Then my test looks like so:
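And a test along these lines, using lolex's install/tick/uninstall API (again a hypothetical stand-in; the test framework and assertion style are just examples):

```js
// Hypothetical stand-in for the test described. startDisplayClock refers to
// the sketch above; `it` and `assert` are just one possible framework choice.
const lolex = require("lolex");
const assert = require("assert");

it("shows the right time after ten minutes", () => {
  const clock = lolex.install(); // fakes global setTimeout
  const frames = [];
  startDisplayClock(seconds => frames.push(seconds));

  clock.tick(10 * 60 * 1000); // advance ten simulated minutes, second by second

  assert.strictEqual(frames[frames.length - 1], 600);
  clock.uninstall();
});
```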
Basically there's a clock component and I want to test what is displayed at different moments in time. While the test may not be perfect (it tests the whole integration instead of display and step separately), it highlighted the problem.
Next steps
I'll gladly prepare a PR with the changes. From what I've managed to find out, the lowest supported version of Node for Lolex is 6, and it supports Map, so there should be no problem.
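For reference, the shape of the change could look roughly like this (a sketch only, not the actual lolex internals or the eventual PR): clock.timers becomes a Map keyed by timer id, and the helpers iterate its values instead of doing for...in with hasOwnProperty.

```js
// Rough sketch of the shape of the proposed change, not the actual lolex diff.
function addTimer(clock, timer) {
  timer.id = ++clock.uniqueTimerId; // assumes an id counter on the clock
  clock.timers.set(timer.id, timer);
  return timer.id;
}

function firstTimerInRange(clock, from, to) {
  let first = null;
  for (const timer of clock.timers.values()) { // no hasOwnProperty guard needed
    if (timer.callAt < from || timer.callAt > to) continue;
    if (!first || timer.callAt < first.callAt) first = timer;
  }
  return first;
}

function clearTimer(clock, id) {
  clock.timers.delete(id);
}
```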