DIAGNijmegen / rse-cirrus-benchmark-suite

A small playwright-based CIRRUS benchmark suite, meant to detect speedups or slowdowns.
Apache License 2.0

Testing results #2

Closed HarmvZ closed 1 month ago

HarmvZ commented 1 month ago

I tested the suite and found a few small things:

╔════════════════════════════════════════════════════════════╗
║ Looks like Playwright was just installed or updated.       ║
║ Please run the following command to download new browsers: ║
║                                                            ║
║     playwright install                                     ║
║                                                            ║
║ <3 Playwright Team                                         ║
╚════════════════════════════════════════════════════════════╝

So I ran

$ python3.11 -m poetry run playwright install

But this returned:

╔══════════════════════════════════════════════════════╗
║ Host system is missing dependencies to run browsers. ║
║ Please install them with the following command:      ║
║                                                      ║
║     sudo playwright install-deps                     ║
║                                                      ║
║ Alternatively, use apt:                              ║
║     sudo apt-get install libnss3\                    ║
║         libnspr4\                                    ║
║         libatk1.0-0\                                 ║
║         libatk-bridge2.0-0\                          ║
║         libcups2\                                    ║
║         libatspi2.0-0\                               ║
║         libxcomposite1\                              ║
║         libxdamage1\                                 ║
║         libxfixes3\                                  ║
║         libxrandr2\                                  ║
║         libgbm1\                                     ║
║         libxkbcommon0\                               ║
║         libpango-1.0-0\                              ║
║         libcairo2\                                   ║
║         libasound2                                   ║
║                                                      ║
║ <3 Playwright Team                                   ║
╚══════════════════════════════════════════════════════╝

So I ran

$ python3.11 -m poetry run playwright install-deps

Output

harm@laptop-harm:~/rse-CIRRUS-benchmark-suite$ python3.11 -m poetry run python cirrus_benchmark_suite/benchmark.py
!! Running in DEBUG mode, benchmarks might be unreliable !!
Base-line page loading: 79±19ms
Traceback (most recent call last):
  File "/home/harm/rse-CIRRUS-benchmark-suite/cirrus_benchmark_suite/benchmark.py", line 212, in <module>
    test()
  File "/home/harm/rse-CIRRUS-benchmark-suite/cirrus_benchmark_suite/benchmark.py", line 190, in test
    benchmarks = benchmark(ctx, session_url)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harm/rse-CIRRUS-benchmark-suite/cirrus_benchmark_suite/benchmark.py", line 137, in benchmark
    benchmark_archive_item(benchmarks, page, session_url)
  File "/home/harm/rse-CIRRUS-benchmark-suite/cirrus_benchmark_suite/benchmark.py", line 119, in benchmark_archive_item
    ).to_be_visible(timeout=20_000)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harm/.cache/pypoetry/virtualenvs/cirrus-benchmark-suite-HSgP5Tem-py3.11/lib/python3.11/site-packages/playwright/sync_api/_generated.py", line 19704, in to_be_visible
    self._sync(self._impl_obj.to_be_visible(visible=visible, timeout=timeout))
  File "/home/harm/.cache/pypoetry/virtualenvs/cirrus-benchmark-suite-HSgP5Tem-py3.11/lib/python3.11/site-packages/playwright/_impl/_sync_base.py", line 115, in _sync
    return task.result()
           ^^^^^^^^^^^^^
  File "/home/harm/.cache/pypoetry/virtualenvs/cirrus-benchmark-suite-HSgP5Tem-py3.11/lib/python3.11/site-packages/playwright/_impl/_assertions.py", line 664, in to_be_visible
    await self._expect_impl(
  File "/home/harm/.cache/pypoetry/virtualenvs/cirrus-benchmark-suite-HSgP5Tem-py3.11/lib/python3.11/site-packages/playwright/_impl/_assertions.py", line 74, in _expect_impl
    raise AssertionError(
AssertionError: Locator expected to be visible
Actual value: <element(s) not found>
Call log:
LocatorAssertions.to_be_visible with timeout 20000ms
  - waiting for locator("[data-plugin-name=\"AnnotationListPlugin\"]")

So I guess I'm missing access to some archive item that is tested?
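
For context, a minimal sketch of how a benchmark step could turn this timeout into a clearer message. The helper name `access_hint` and the wording of the hint are hypothetical and not part of the suite; only the locator string and the `to_be_visible(timeout=20_000)` call are taken from the traceback above.

```python
from contextlib import contextmanager


@contextmanager
def access_hint(what: str):
    """Re-raise an AssertionError with a note that missing access is a likely cause.

    Hypothetical helper for illustration; it is not part of the benchmark suite.
    """
    try:
        yield
    except AssertionError as err:
        raise AssertionError(
            f"{err}\n"
            f"Hint: {what} never became visible. Check that the logged-in "
            "account has access to the item being benchmarked."
        ) from err


# Possible usage inside a benchmark step (requires Playwright and a `page`):
#   from playwright.sync_api import expect
#   with access_hint("AnnotationListPlugin"):
#       expect(
#           page.locator('[data-plugin-name="AnnotationListPlugin"]')
#       ).to_be_visible(timeout=20_000)
```

The original assertion text is kept and the hint is appended, so the Playwright call log would still be visible in the output.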

chrisvanrun commented 1 month ago

Thanks Harm! The extra dependencies are not completely unexpected. I had considered a containerized approach, but being able to do a headed debug run-through is sort of nice!

I'll get down to tweaking some things here.

chrisvanrun commented 1 month ago

@HarmvZ, I've updated the main branch. Could I ask you to try again? The archive item seemed to have only 1 of your accounts as a reader, so that might have caused the problems.

Do note that the name of the repo has been updated, and you'll need to lowercase the 'CIRRUS'.

HarmvZ commented 1 month ago

It works 🎉

harm@laptop-harm:~/rse-CIRRUS-benchmark-suite$ python3.11 -m poetry run python cirrus_benchmark_suite/benchmark.py
!! Running in DEBUG mode, benchmarks might be unreliable !!
Base-line page loading: 82±17ms
## readerstudy.loading_first_case
### Runtime: 16306ms
P-value: 0.000%
        Probability of getting this runtime under the assumption that it is from the reference distribution: a low value suggests an outlier.
Reference distribution (N=36):
Average±SEM: 11173±209ms
33rd Percentile: 11081ms
66th Percentile: 11260ms

---
## readerstudy.navigate_to_second_case
### Runtime: 1317ms
P-value: 71.743%
        Probability of getting this runtime under the assumption that it is from the reference distribution: a low value suggests an outlier.
Reference distribution (N=36):
Average±SEM: 1324±20ms
33rd Percentile: 1315ms
66th Percentile: 1332ms

---
## algorithmjob.loading
### Runtime: 12269ms
P-value: 0.000%
        Probability of getting this runtime under the assumption that it is from the reference distribution: a low value suggests an outlier.
Reference distribution (N=36):
Average±SEM: 11246±127ms
33rd Percentile: 11191ms
66th Percentile: 11299ms

---
## archiveitem.loading
### Runtime: 13285ms
P-value: 0.000%
        Probability of getting this runtime under the assumption that it is from the reference distribution: a low value suggests an outlier.
Reference distribution (N=36):
Average±SEM: 11238±104ms
33rd Percentile: 11193ms
66th Percentile: 11281ms

---
Benchmarking finished!
Total runtime: 79.8 seconds

chrisvanrun commented 4 weeks ago

Yay. Interesting to see that yours runs a lot slower =P
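
To make the "a lot slower" reading concrete, here is a rough sketch of how the P-value lines in the output above could be interpreted. It assumes the reference distribution is a normal distribution centred on the reported average, with the reported SEM as its scale, and that the P-value is two-sided. That is a guess based on the printed numbers; the actual computation in `benchmark.py` is not shown in this thread, and the function name is made up for illustration.

```python
import math


def two_sided_p_value(runtime_ms: float, mean_ms: float, scale_ms: float) -> float:
    """Two-sided tail probability of runtime_ms under Normal(mean_ms, scale_ms).

    ASSUMPTION: normal reference distribution with the reported SEM as scale.
    """
    z = abs(runtime_ms - mean_ms) / scale_ms
    # For a standard normal variable Z, P(|Z| >= z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


# Runtime, average and SEM copied from the pasted output.
p_navigate = two_sided_p_value(1317, 1324, 20)
p_first_case = two_sided_p_value(16306, 11173, 209)

print(f"readerstudy.navigate_to_second_case: {p_navigate:.3%}")
print(f"readerstudy.loading_first_case: {p_first_case:.3%}")
```

Under these assumptions the navigation step comes out in the low 70s percent, in the same range as the reported 71.743% (the inputs here are the rounded values from the output), while the first-case load sits more than 20 scale units above the average, which is why it prints as 0.000%.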