moh-incom commented 11 months ago

The body of unitForToken creates a lot of objects, including a bunch of RegExps. When parsing a lot of strings using DateTime.fromFormat in the same locale, these objects are created from scratch every time even though they are identical each time. This simple PR caches the resulting unitate function for each locale on first use so these objects are only created once per locale.
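A minimal sketch of the caching idea described above, not the actual Luxon code: `buildUnitParser`, `cachedUnitParser`, and the plain string locale key are hypothetical stand-ins, and the builder only imitates the cost of allocating RegExps on every call.

```javascript
// Hypothetical stand-in for the expensive builder: it allocates new RegExps
// every time it runs, similar to the function body described above.
let buildCount = 0;
function buildUnitParser(localeKey) {
  buildCount += 1;
  const twoDigits = new RegExp("^\\d{2}");
  const fourDigits = new RegExp("^\\d{4}");
  return (token) => (token === "yyyy" ? fourDigits : twoDigits);
}

// Cache keyed by a locale identifier; the parser is built on first use only.
const parserCache = new Map();
function cachedUnitParser(localeKey) {
  let parser = parserCache.get(localeKey);
  if (parser === undefined) {
    parser = buildUnitParser(localeKey);
    parserCache.set(localeKey, parser);
  }
  return parser;
}

// 1000 lookups in one locale plus one lookup in another build only two parsers.
for (let i = 0; i < 1000; i++) {
  cachedUnitParser("en-US")("yyyy");
}
cachedUnitParser("da-DK")("MM");
console.log(buildCount); // 2
```

The memory cost is one cached closure (and its RegExps) per distinct locale key.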

I have carried out a simple benchmark using the below code:

const { DateTime } = require('luxon');

const TZ = 'UTC';
const str = '04/12/2023 20:20';
const count = 100_000;

// Parse the same string `count` times with a fixed format and zone.
const dts = [];
for (let index = 0; index < count; index++) {
  dts.push(DateTime.fromFormat(str, 'MM/dd/yyyy HH:mm', { zone: TZ }));
}
console.log(dts[count - 1].toISO());

My measurements indicate roughly a 2x speed-up on Node 18.18.0. The cache might cause a small to negligible increase in memory usage, since only a few functions are kept per locale.
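For reference, a loop like the one above could be timed roughly as follows. This is a sketch only: `measure` is a hypothetical helper, and a RegExp-building stand-in replaces the `DateTime.fromFormat` call so the snippet runs without Luxon installed. It assumes a Node version that exposes `performance` as a global.

```javascript
// Time `count` calls of `fn` and report approximate throughput.
function measure(label, fn, count) {
  const start = performance.now();
  let last;
  for (let i = 0; i < count; i++) {
    last = fn();
  }
  const elapsedMs = performance.now() - start;
  const opsPerSec = Math.round(count / (elapsedMs / 1000));
  console.log(`${label}: ${elapsedMs.toFixed(1)} ms, ~${opsPerSec} ops/sec`);
  return { elapsedMs, opsPerSec, last };
}

// Stand-in workload: build a RegExp and match against it on every call.
const result = measure(
  "regex build + test",
  () => new RegExp("^\\d{2}").test("04"),
  100000
);
```

Running the same helper before and after a change gives a quick ratio; the project's benchmark suite reports more robust numbers with error margins.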

Edit

Benchmark results before and after:

Before (Luxon 3.4.4)

DateTime.local x 1,374,058 ops/sec ±1.07% (92 runs sampled)
DateTime.fromObject with locale x 474,771 ops/sec ±0.67% (97 runs sampled)
DateTime.local with numbers x 355,343 ops/sec ±0.68% (92 runs sampled)
DateTime.fromISO x 132,521 ops/sec ±5.31% (89 runs sampled)
DateTime.fromSQL x 206,776 ops/sec ±0.37% (98 runs sampled)
DateTime.fromFormat x 25,289 ops/sec ±0.52% (95 runs sampled)
DateTime.fromFormat no cache x 7,549 ops/sec ±0.83% (91 runs sampled)
DateTime.fromFormat with zone x 11,800 ops/sec ±0.50% (96 runs sampled)
DateTime.fromFormat with non-English locale x 3,464 ops/sec ±0.96% (90 runs sampled)
DateTime.fromFormat with non-English locale no cache x 2,032 ops/sec ±1.53% (88 runs sampled)
DateTime#setZone x 75,920 ops/sec ±0.73% (93 runs sampled)
DateTime#toFormat x 452,849 ops/sec ±0.46% (98 runs sampled)
DateTime#toFormat with macro x 167,433 ops/sec ±0.80% (90 runs sampled)
DateTime#toFormat with macro no cache x 5,370 ops/sec ±3.41% (79 runs sampled)
DateTime#add x 106,457 ops/sec ±38.67% (96 runs sampled)
DateTime#toISO x 3,377,444 ops/sec ±0.24% (95 runs sampled)
DateTime#toLocaleString x 217,329 ops/sec ±0.80% (92 runs sampled)
DateTime#toLocaleString in utc x 91,215 ops/sec ±1.26% (94 runs sampled)
DateTime#toRelativeCalendar x 5,232 ops/sec ±2.63% (92 runs sampled)

After

DateTime.local x 1,271,155 ops/sec ±2.69% (94 runs sampled)
DateTime.fromObject with locale x 466,883 ops/sec ±0.68% (95 runs sampled)
DateTime.local with numbers x 360,438 ops/sec ±0.58% (94 runs sampled)
DateTime.fromISO x 125,678 ops/sec ±0.83% (94 runs sampled)
DateTime.fromSQL x 188,691 ops/sec ±0.67% (93 runs sampled)
DateTime.fromFormat x 62,886 ops/sec ±0.48% (94 runs sampled)    <- ~2.5x speedup
DateTime.fromFormat no cache x 9,227 ops/sec ±1.58% (88 runs sampled)    <- new
DateTime.fromFormat with zone x 16,462 ops/sec ±1.32% (97 runs sampled)
DateTime.fromFormat with non-English locale x 39,728 ops/sec ±0.64% (93 runs sampled)    <- ~11x speedup, new
DateTime.fromFormat with non-English locale no cache x 2,780 ops/sec ±1.21% (89 runs sampled)    <- new
DateTime#setZone x 79,141 ops/sec ±0.51% (93 runs sampled)
DateTime#toFormat x 525,082 ops/sec ±0.46% (89 runs sampled)
DateTime#toFormat with macro x 165,672 ops/sec ±0.86% (91 runs sampled)
DateTime#toFormat with macro no cache x 5,250 ops/sec ±3.51% (80 runs sampled)
DateTime#add x 127,340 ops/sec ±14.36% (93 runs sampled)
DateTime#toISO x 3,297,937 ops/sec ±0.51% (93 runs sampled)
DateTime#toLocaleString x 213,467 ops/sec ±0.84% (93 runs sampled)
DateTime#toLocaleString in utc x 80,442 ops/sec ±5.44% (81 runs sampled)
DateTime#toRelativeCalendar x 11,187 ops/sec ±0.97% (86 runs sampled)    <- 2x speedup

linux-foundation-easycla[bot] commented 11 months ago

The committers listed above are authorized under a signed CLA.

:white_check_mark: login: moh-incom / name: Mads Overgård Henningsen (adc537259ea118d7d29e8f11f03989ac9ffc0fbf, 4bcc4c3ef28a8b4366b788a463bdca830e44aee6, 9aa823bd64d900e4f12cb928b711c341dee47585, 8b48f1f3ef5d9acbe10afc88e0e37734e94ebd96, 7fd1c6806fe07879762d0080011641561db00e8a, 9d928828b5faff45af91981df8e718b0416f3c86)

icambron commented 10 months ago

@moh-incom This seems reasonable, but it would be nice if the benchmark were included in the PR (see the benchmarks dir).

moh-incom commented 10 months ago

I was unaware of the benchmarks directory. Are there any instructions on running the benchmarks? The straightforward guess, npm run benchmark, doesn't work for me (it fails with a cryptic ESM/CommonJS incompatibility error).

From the looks of it, the existing benchmarks DateTime.fromFormat and DateTime.fromFormat with zone should already cover what my benchmark does.

icambron commented 10 months ago

I pushed up a fix for npm run benchmark

moh-incom commented 10 months ago

I found that a test from the test suite was failing because I was caching on the Locale.locale field, which does not include the numbering system. I now cache on Locale.intl instead, which should cover all the problematic cases.
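The cache-key problem can be illustrated with a small sketch. Everything here is hypothetical: the locale descriptors, `buildDigitRegex`, `makeCachedLookup`, and the `-u-nu-` key format are assumptions for illustration, not Luxon's internals.

```javascript
// Hypothetical builder whose result depends on the numbering system.
function buildDigitRegex({ numberingSystem }) {
  // Arabic-Indic digits are U+0660..U+0669; otherwise match ASCII digits.
  return numberingSystem === "arab" ? /^[\u0660-\u0669]+/ : /^\d+/;
}

// Returns a memoized lookup that caches under whatever key `keyOf` produces.
function makeCachedLookup(keyOf) {
  const cache = new Map();
  return (loc) => {
    const key = keyOf(loc);
    if (!cache.has(key)) cache.set(key, buildDigitRegex(loc));
    return cache.get(key);
  };
}

const latn = { locale: "en-US" };
const arab = { locale: "en-US", numberingSystem: "arab" };
const arabicYear = "\u0662\u0660\u0662\u0663"; // "2023" in Arabic-Indic digits

// Keyed on the locale alone: the second lookup reuses the ASCII regex.
const byLocaleOnly = makeCachedLookup((loc) => loc.locale);
byLocaleOnly(latn);
console.log(byLocaleOnly(arab).test(arabicYear)); // false

// Keyed on locale plus numbering system: each combination gets its own entry.
const byFullKey = makeCachedLookup((loc) =>
  loc.numberingSystem ? `${loc.locale}-u-nu-${loc.numberingSystem}` : loc.locale
);
byFullKey(latn);
console.log(byFullKey(arab).test(arabicYear)); // true
```

Any property that changes the built result has to be part of the cache key, otherwise a later caller silently receives a stale entry.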

I got the benchmarks running after you fixed them. I also added a few more benchmarks to make it easier to test the impact of my optimizations.

Even though it might be a bit outside the scope of this PR, I also added similar caching to the Locale.isEnglish function. This dramatically speeds up DateTime.fromFormat when run with a non-English locale, which is also covered by one of the new benchmarks. If you prefer, I can move it to a different PR.
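A rough sketch of what caching such a check could look like. The function names and the "starts with en" test are assumptions for illustration and are not claimed to match Luxon's actual isEnglish logic; the point is only that the `Intl.DateTimeFormat` resolution runs once per locale string.

```javascript
let resolveCalls = 0;

// Hypothetical uncached check: constructing a formatter and resolving its
// options is comparatively costly when done on every parse.
function isEnglishUncached(intlTag) {
  resolveCalls += 1;
  const resolved = new Intl.DateTimeFormat(intlTag).resolvedOptions().locale;
  return resolved.toLowerCase().startsWith("en");
}

// Memoize the answer per locale string.
const englishCache = new Map();
function isEnglishCached(intlTag) {
  if (!englishCache.has(intlTag)) {
    englishCache.set(intlTag, isEnglishUncached(intlTag));
  }
  return englishCache.get(intlTag);
}

for (let i = 0; i < 1000; i++) {
  isEnglishCached("en-US");
}
console.log(isEnglishCached("en-US"), resolveCalls); // true 1
```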

moh-incom commented 7 months ago

Any progress on approving this or #1581? I do not have a strong preference as to which of the two is used, but I could really use the speedup soon.

icambron commented 7 months ago

Closing this in favor of #1581. Thanks for your work on this!

moment / luxon

Cache unitate function #1530
