krausest / js-framework-benchmark

A comparison of the performance of a few popular javascript frameworks
https://krausest.github.io/js-framework-benchmark/
Apache License 2.0

Anomalous "script bootup time" for lit-v2.0.0-rc.1? #890

Open kevinpschaaf opened 3 years ago

kevinpschaaf commented 3 years ago

We looked into an unexpected regression in "script bootup time" between lit-element-v2.4.0 and lit-v2.0.0-rc.1 that shows up in the (as yet unpublished) version of table.html on master, but we cannot reproduce it locally.

(Because lit-element-v2.4.0 was replaced with lit-v2.0.0-rc.1, the comparison below uses lit-html as a stable baseline.)

On master:

image

Local:

image

Would it be possible to re-run/re-check the results before publishing?

krausest commented 3 years ago

Surely. For the startup benchmark I'm relying on Lighthouse and performing 4 runs. I just repeated it for lit on my Linux Razer Blade machine and got the following results:

[
  {
    TimeToConsistentlyInteractive: 2182.092,
    ScriptBootUpTtime: 16,
    MainThreadWorkCost: 216.2119999999999,
    TotalKiloByteWeight: 163.5859375
  },
  {
    TimeToConsistentlyInteractive: 2360.6275000000005,
    ScriptBootUpTtime: 73.11599999999999,
    MainThreadWorkCost: 442.18399999999997,
    TotalKiloByteWeight: 163.5859375
  },
  {
    TimeToConsistentlyInteractive: 2182.0650000000005,
    ScriptBootUpTtime: 16,
    MainThreadWorkCost: 245.80799999999994,
    TotalKiloByteWeight: 163.5859375
  },
  {
    TimeToConsistentlyInteractive: 2180.868,
    ScriptBootUpTtime: 16,
    MainThreadWorkCost: 220.63199999999992,
    TotalKiloByteWeight: 163.5859375
  }
]

In this case the run results in a mean of 30.28 and a median of 16, with a std deviation of 29 for the bootup time. So currently I'd say that in my run (41.9) there were two slow outliers, such that the median was above 16 ms. In your run the std deviation is about 17, so I guess there were three 16 ms runs and one outlier.
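To make the arithmetic above easy to re-check, here is a minimal sketch that recomputes the mean, median and standard deviation from the four `ScriptBootUpTtime` values posted above. The helper names are hypothetical, and the sample (n - 1) standard deviation is an assumption; the benchmark driver may compute it differently.

```typescript
// Script bootup times (ms) copied from the four runs posted above.
const bootupTimes: number[] = [16, 73.11599999999999, 16, 16];

// Arithmetic mean.
function mean(xs: number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Median: middle value, or the average of the two middle values.
function median(xs: number[]): number {
  const sorted = xs.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Sample standard deviation (divides by n - 1). This choice is an assumption.
function sampleStdDev(xs: number[]): number {
  const m = mean(xs);
  const sumSq = xs.reduce((acc, x) => acc + (x - m) * (x - m), 0);
  return Math.sqrt(sumSq / (xs.length - 1));
}

console.log(mean(bootupTimes).toFixed(2));         // 30.28
console.log(median(bootupTimes));                  // 16
console.log(sampleStdDev(bootupTimes).toFixed(1)); // 28.6
```

A single slow run out of four moves the mean to roughly twice the median, which is why one outlier is enough to change how the result looks in the table.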

Can you please post the file webdriver-ts/results/lit-v2.0.0-rc.1-keyed_32_startup-bt.json?

Most frameworks have a 0 std deviation in the script bootup time; notable exceptions with a higher std deviation are lit, most react implementations and most react-redux implementations. This leads to the question: why do only some frameworks show a significant std deviation? Are those outliers real or just a wrong measurement?

krausest commented 3 years ago

I tried running lighthouse on the command line and comparing the results: lighthouse http://localhost:8080/frameworks/keyed/lit/index.html --output=json --only-audits=bootup-time | less

The results are not really comparable to what the benchmark driver reports, but also show some noise (between 62 and 112 msecs on my machine). Please note that values below 16 msecs are clamped to 16 msecs.
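As an illustration of what the clamp does to the spread, here is a small sketch under stated assumptions: `clampBootup` and `CLAMP_FLOOR_MS` are hypothetical names, not the driver's actual code, and the input values are the unclamped `ScriptBootUpTtime` numbers from the two runs posted in this comment.

```typescript
// Floor described in the comment: values below 16 ms are reported as 16 ms.
const CLAMP_FLOOR_MS = 16;

// Hypothetical helper; the real driver may implement the clamp differently.
function clampBootup(ms: number): number {
  return Math.max(CLAMP_FLOOR_MS, ms);
}

// Unclamped values (ms) from the two runs in this comment.
const quietRun = [6, 6.359999999999999, 7.9319999999999995, 6.5680000000000005];
const noisyRun = [8.395999999999999, 78.04799999999999, 6.728000000000001, 7.9159999999999995];

console.log(quietRun.map(clampBootup)); // [ 16, 16, 16, 16 ]
console.log(noisyRun.map(clampBootup)); // [ 16, 78.04799999999999, 16, 16 ]
```

With the clamp in place, any run whose raw values all fall below 16 ms collapses to identical values and therefore a std deviation of 0, while a single slow run stays visible as an outlier.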

Without clamping, the results look like this:

******* result  [
  {
    TimeToConsistentlyInteractive: 2031.44,
    ScriptBootUpTtime: 6,
    MainThreadWorkCost: 214.15599999999995,
    TotalKiloByteWeight: 163.5869140625
  },
  {
    TimeToConsistentlyInteractive: 2180.463,
    ScriptBootUpTtime: 6.359999999999999,
    MainThreadWorkCost: 213.55999999999995,
    TotalKiloByteWeight: 163.5869140625
  },
  {
    TimeToConsistentlyInteractive: 2180.4045,
    ScriptBootUpTtime: 7.9319999999999995,
    MainThreadWorkCost: 221.7479999999999,
    TotalKiloByteWeight: 163.583984375
  },
  {
    TimeToConsistentlyInteractive: 2180.7870000000003,
    ScriptBootUpTtime: 6.5680000000000005,
    MainThreadWorkCost: 218.65199999999993,
    TotalKiloByteWeight: 163.5849609375
  }
]

But also like this:

[
  {
    TimeToConsistentlyInteractive: 2181.4845,
    ScriptBootUpTtime: 8.395999999999999,
    MainThreadWorkCost: 213.74799999999993,
    TotalKiloByteWeight: 163.5849609375
  },
  {
    TimeToConsistentlyInteractive: 2368.5159999999996,
    ScriptBootUpTtime: 78.04799999999999,
    MainThreadWorkCost: 444.8519999999999,
    TotalKiloByteWeight: 163.5849609375
  },
  {
    TimeToConsistentlyInteractive: 2181.4485,
    ScriptBootUpTtime: 6.728000000000001,
    MainThreadWorkCost: 217.408,
    TotalKiloByteWeight: 163.5869140625
  },
  {
    TimeToConsistentlyInteractive: 2180.1975,
    ScriptBootUpTtime: 7.9159999999999995,
    MainThreadWorkCost: 222.19199999999992,
    TotalKiloByteWeight: 163.5869140625
  }
]

krausest commented 3 years ago

At least temporarily, I removed the 16 ms cap on script bootup time and reran all implementations. This time the variance was pretty low for lit: https://krausest.github.io/js-framework-benchmark/current.html

kevinpschaaf commented 3 years ago

Sorry for the delay. Here's my webdriver-ts/results/lit-v2.0.0-rc.1-keyed_32_startup-bt.json:

{
  "framework": "lit-v2.0.0-rc.1-keyed",
  "keyed": true,
  "benchmark": "32_startup-bt",
  "type": "startup",
  "min": 16,
  "max": 16,
  "mean": 16,
  "median": 16,
  "geometricMean": 16,
  "standardDeviation": 0,
  "values": [16, 16, 16, 16]
}

Yeah, the big 10x outlier(s) are interesting; it's hard to evaluate where they're coming from.