npm / node-semver

The semver parser for node (the one npm uses)
ISC License
5.09k stars 492 forks source link

fix: optimize Range parsing and formatting #726

Closed jviide closed 3 months ago

jviide commented 3 months ago

This pull request optimizes the Range class in the following ways:

  1. Produce fewer intermediate objects when reducing a range's space characters to single spaces. This seems to improve bench-subset scores by up to 5%, and bench-satisfies scores to a lesser degree.
  2. Optimize Range formatting with explicit for loops instead, avoiding an intermediate array creation. This seems to improve bench-satisfies and bench-subset scores by up to 20%.
  3. Calculate Range's .range string (used by .format() and .toString()) lazily. This seems to improve bench-satisfies and bench-subset scores by up to 9%.

The external interface for the class stays the same, except for the new internal .formatted property used to cache its lazily calculated string. Range#range property is now also read-only.

There is a new test lazy formatting to ensure full test coverage.

The benchmarks bench-satisfies and bench-subset benefit from these changes, sometimes by up to 40%. Other benchmark results seem to stay the same. Here are the affected benchmarks before:

$ node benchmarks/bench-satisfies.js
satisfies(1.0.6, 1.0.3||^2.0.0) x 695,094 ops/sec ±0.68% (97 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0) x 764,115 ops/sec ±0.40% (99 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0) x 805,593 ops/sec ±0.62% (97 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true}) x 695,045 ops/sec ±0.73% (95 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true}) x 750,433 ops/sec ±0.66% (99 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true}) x 787,903 ops/sec ±0.39% (99 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true}) x 652,166 ops/sec ±0.34% (99 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true}) x 696,377 ops/sec ±0.36% (96 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true}) x 721,729 ops/sec ±0.35% (98 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 585,692 ops/sec ±0.75% (95 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 631,653 ops/sec ±0.33% (96 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 650,110 ops/sec ±0.64% (95 runs sampled)

$ node benchmarks/bench-subset.js
subset(1.2.3, *) x 633,342 ops/sec ±0.61% (95 runs sampled)
subset(^1.2.3, *) x 743,036 ops/sec ±0.47% (97 runs sampled)
subset(^1.2.3-pre.0, *) x 680,087 ops/sec ±0.76% (98 runs sampled)
subset(^1.2.3-pre.0, *) x 680,948 ops/sec ±0.46% (96 runs sampled)
subset(1 || 2 || 3, *) x 330,669 ops/sec ±0.53% (98 runs sampled)

And after:

$ node benchmarks/bench-satisfies.js
satisfies(1.0.6, 1.0.3||^2.0.0) x 896,936 ops/sec ±0.53% (94 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0) x 998,214 ops/sec ±0.40% (95 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0) x 1,000,593 ops/sec ±0.43% (97 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true}) x 890,369 ops/sec ±0.41% (100 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true}) x 977,239 ops/sec ±0.48% (97 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true}) x 983,682 ops/sec ±0.95% (96 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true}) x 805,330 ops/sec ±0.84% (98 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true}) x 894,117 ops/sec ±0.43% (99 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true}) x 911,742 ops/sec ±0.42% (96 runs sampled)
satisfies(1.0.6, 1.0.3||^2.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 741,254 ops/sec ±0.35% (97 runs sampled)
satisfies(1.0.6, 2.2.2||~3.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 807,380 ops/sec ±0.42% (99 runs sampled)
satisfies(1.0.6, 2.3.0||<4.0.0, {"includePrelease":true,"loose":true,"rtl":true}) x 820,390 ops/sec ±0.37% (99 runs sampled)

$ node benchmarks/bench-subset.js
subset(1.2.3, *) x 905,030 ops/sec ±0.63% (96 runs sampled)
subset(^1.2.3, *) x 1,026,457 ops/sec ±0.63% (95 runs sampled)
subset(^1.2.3-pre.0, *) x 923,789 ops/sec ±0.41% (97 runs sampled)
subset(^1.2.3-pre.0, *) x 923,136 ops/sec ±0.44% (96 runs sampled)
subset(1 || 2 || 3, *) x 432,037 ops/sec ±0.67% (96 runs sampled)
jviide commented 3 months ago

The "refactor:" pull request name prefix was not allowed so I picked "fix:", and changed the commit messages to use the same prefix as well.

wraithgar commented 3 months ago

We generally squash-merge external PRs unless there's a compelling reason to keep each commit. linting will pass if either the PR title or every commit has a valid conventional prefix.

Don't worry about keeping the commit messages "valid" as we go through this PR, as we'll be using the PR title if and when this lands.

wraithgar commented 3 months ago

Thanks for the help in reviewing this @kurtextrem. Yes, that comment was referring to s* which is a classic ReDOS vector. Historically we've used the .split approach to fix that, because it's rarely in a hot path where performance matters as much as it does here. Good to know that String.replace is faster.