mde / timezone-js

DEPRECATED: Timezone-enabled JavaScript Date object. Uses Olson zoneinfo files for timezone data.
824 stars 182 forks source link

Speed considerations in case of changing timezone in a tight loop #28

Closed itsuryev closed 4 years ago

itsuryev commented 12 years ago

Let's say I have 500 appointments, each appointment have start and end dates (they are all timezoneJS.Date instances at this point). Now I change timezone using select and want to iterate over each appointment and switch timezone both to start and end dates. Selected timezone is the same for all events.

Basically:

var timezone = "SomeTimezone";
var events = [];
for (var i=0; i<events.length; i++) {
    var ev = events[i];
    ev.start_date.setTimezone(timezone);
    ev.end_date.setTimezone(timezone);
}

That happens to be quite slow. Is that expected or are there ways to improve it from my side?

The most time consuming functions are:

       Name                       Calls     %        Own time
1. convertRuleToExactDateAndTime  44760   41.56%    1660.881ms
2. parseTimeString                92502   25.47%    1018.099ms
3. getZone                        3000    6.44%     257.474ms
4. compareDates                   31092   6.31%     252.205ms
5. getRule                        3000    4.32%     172.64ms
6. findApplicableRules            5778    4.08%     163.247ms
longlho commented 12 years ago

Hmm I'll take a look at this. There might be optimizations to be done. Adjusting timezone means adjusting pretty much everything from the minutes up to years.

itsuryev commented 12 years ago

Maybe that's really far-fetched call but is it possible to cache results of the previous lookups (that way next conversion with the same Timezone would have to check less). Here I am talking not even about caching results of parseTimeString for example but something more. That would bring us to even more distant case of loading not simply parsed timezone data but that cache directly.

More real case - parseTimeString. I am not sure what kind of incoming strings it might get but I tried the following: reduced the number of events to 186 (excluded copy-paste ones). Then stored incoming parameters with the number of calls.

String    Number of times
0:        1488
0:00s:    2232
1:00:     1602
2:00:     2208
2:00s:    22260
3:00:     24
23:59:59: 6696

So here in perfect case scenario we would get only 7 calls (and others will be simple lookups in the cache).

Will keep looking if it helps.

longlho commented 12 years ago

I'll add some benchmark test cases but before that I just wanna clarify your use case: You have a bunch of time string in a certain timezone and you construct timezoneJS.Date(string, oldTz) then do a setTimezone to convert it to another timezone right?

longlho commented 12 years ago

I've added a benchmark test in commit 2bb8433 and made some improvements. Any clarifications on your use cases would be great.

Right now on my machine (Mac Mini Core i7 & 16Gb RAM, no SSD) takes around 2s to convert 5000 dates to a different timezone, in the latest revision it takes around 750ms.

longlho commented 12 years ago

There're definitely ways to further store processed values... looking into it

itsuryev commented 12 years ago

First of all thanks for your help :)

What time consuming operations we can do at all?

  1. Create timezoneJS.Date object using various means
  2. Convert them (through setTimezone or creating new one)

In my case I create bunch of events but I use UTC timezone so most likely that's the reason why I didn't notice that application is loading slow (as it isn't -> UTC is fast). Then I provide selectbox with list of timezones. User can change it's value and at this moment I iterate over all loaded events and change their timezones. Code is essentially the same as in my first message.

Your latest commit certainly improved performance making parseTimeString the weakest link. I've sent the pull request where it's result are cached and called only then needed. In my case it made everything faster. New results (that's with storing parsed time strings, same data set as in my first message but different computer):

       Name                       Calls     %        Own time
1. convertRuleToExactDateAndTime  44760   28.44%     245.664ms
2. compareDates                   31092   15.37%     132.795ms
3. findApplicableRules             5778   10.64%      91.88ms
4. getRule                         3000   10.53%      90.942ms
5. getZone                          3000   5.71%      49.351ms
6. getTzInfo                        3000   3.31%     28.613ms

Now parseTimeString is completely off the charts. Also it's quite amusing how small iteration (where we created short months and days in convertRuleToExactDateAndTime) but multiplied on ~45000 runs created such a delay.

longlho commented 12 years ago

Thanks for the pull request. Since both parseTimeString & convertRuleToExactDateAndTime are used to process tz I'm planning to store them in the tz data model itself. I'll definitely try to look into it this weekend.

itsuryev commented 12 years ago

Tried to cache results of convertRuleToExactDateAndTime (code updated). Too bad can't do it if prevRule was specified.

Speed results:

       Name                       Calls     %        Own time
1. compareDates                   31224    32.6%     227.884ms
2. getRule                         3000   14.35%     100.316ms
3. convertRuleToExactDateAndTime   5788    5.85%      40.884ms
4. findApplicableRules             5778    5.51%      38.521ms
5. getTzInfo                       3000    4.59%      32.056ms
6. convertDateToUTC               11556    4.09%      28.566ms

Another note about benchmark: , yearToMillis = 5 * 365 * 24 * 3600 * 1000 wider range of dates slows it quite considerably (up to actually 4 times in my case).

mde commented 12 years ago

Crap, I didn't see this discussion before I merged. I only looked at the auto-generated ticket for the pull-request, and the diff itself. @longlho, can you let me know if you need me to back this out?

mde commented 12 years ago

The only concern I had with this was adding to code-complexity in code that's already pretty hairy. But all the on-the-fly calculations make things pretty slow, so anything we can do to speed things up seems like a win. @longlho, is it worth accepting this pull and incrementally building on that?

longlho commented 12 years ago

I'm trying to store further processed values in the model itself. For ex zone America/New_York has an array of like 1988, Jun, 03 that can be converted to UTC millis for later use instead of doing it everytime.

If you can roll back the pull request while I investigate that'd be cool.

Long Ho Software Engineer Selerity

On Jun 23, 2012, at 4:58 PM, Matthew Eernissereply@reply.github.com wrote:

The only concern I had with this was adding to code-complexity in code that's already pretty hairy. But all the on-the-fly calculations make things pretty slow, so anything we can do to speed things up seems like a win. @longlho, is it worth accepting this pull and incrementally building on that?


Reply to this email directly or view it on GitHub: https://github.com/mde/timezone-js/issues/28#issuecomment-6528239

longlho commented 12 years ago

Actually we can build from the current master so don't worry about it :) Thanks @itsuryev :D

longlho commented 12 years ago

BTW @itsuryev what did you use to profile the calls?

itsuryev commented 12 years ago

Good old Firebug in Firefox.

  1. F12 -> Console -> Profile
  2. Execute action (change timezone select in my case or run benchmark code in console)
  3. Click on profile again

Results are displayed.

longlho commented 12 years ago

Ahh the performance on browser is significantly worse. I've been doing it on the server side that's why I haven't seen it. Thanks :)

longlho commented 12 years ago

Just pushed some more optimizations: I rm'ed the PARSED_TIME_STRING cache and shift the processing to initialization stage. Shaved like 30% time locally, on Travis it shaved like 50%.

The trade-off of course is memory footprint (its always a space vs. time issue).

Still optimizing :)

mde commented 12 years ago

Nice work on this. :)

itsuryev commented 12 years ago

Hello,

Good work! Though strangely enough it actually became slower:

       Name                       Calls     %        Own time
1. compareDates                   31136   50.87%     473.56ms
2. getRule                         3000   10.04%     93.468ms
3. convertRuleToExactDateAndTime   5762    8.16%     75.943ms
4. findApplicableRules             5778    7.63%     71.061ms
5. convertDateToUTC               11524    3.10%     28.849ms
6. getTzInfo                       3000    2.85%     26.518ms

Just to make sure that's not some changes in my enviroment I reverted back to my file and got the previous results. Same is for the test benchmark ~1300ms vs ~950ms (old).

Also was puzzled how come the speed of converting 5000 dates in behcnmark is about the same as 1000 in my sample (even considering that's node vs browser) and found out the timezones are not actually the same (doh! :) ). In my case I was converting "Europe/Minsk" <-> "Europe/London". And "Asia/Bangkok" <-> "Europe/Minsk" is quite faster. Latest version: Bangkok -> Minsk: ~1300ms London -> Minsk: ~2000ms That's not a problem at all but simply an explanation why numbers I kept providing are higher than those in benchmark.

thorsent commented 10 years ago

This is an old one but I just ran into it (iterating across a few thousand nodes in a stock chart). The bottleneck appears to be the use of sort() in order to determine whether the date lies inbetween the set of applicableRules. I'm wondering if anyone has made an attempt, or has thoughts, on optimizing this section.

My initial thought would be to use an iterative approach, searching through the ruleset for the position of the date rather than using the sorting & pinpoint strategy but there's a lot going on here and I am wary of edge cases. Also not sure if I can rely on rulesets being presorted.