canonical / mir-ci

Mir CI helpers

Benchmarking for the DisplayServer and its spawned application #40

Closed mattkae closed 10 months ago

mattkae commented 10 months ago

Benchmarking

What's new?

How to test

  1. git checkout feature/benchmarks
  2. cd mir-ci/mir-ci
  3. pip install -e ..
  4. pytest --junitxml=junit.xml test_apps_can_run.py
  5. When that finishes, open up junit.xml and view the test results

What do the test results look like in junit.xml?

Like this:

...
    <testcase classname="test_apps_can_run.TestAppsCanRun" name="test_app_can_run[mir_demo_server-qterminal]" file="test_apps_can_run.py" line="12" time="3.153">
      <properties>
        <property name="compositor_cpu_time_microseconds" value="114179"></property>
        <property name="compositor_max_mem_bytes" value="58060800"></property>
        <property name="compositor_avg_mem_bytes" value="56726775.172413796"></property>
        <property name="client_cpu_time_microseconds" value="584842"></property>
        <property name="client_max_mem_bytes" value="37314560"></property>
        <property name="client_avg_mem_bytes" value="32306846.896551725"></property>
      </properties>
    </testcase>

...
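The properties in that excerpt are plain name/value attributes on each testcase, so a consumer can read them back with the standard library. A minimal sketch (not part of this PR; property names taken from the example above):

```python
import xml.etree.ElementTree as ET


def read_benchmark_properties(junit_xml: str) -> dict:
    """Collect <property> name/value pairs from every <testcase> in a JUnit report."""
    root = ET.fromstring(junit_xml)
    results = {}
    for testcase in root.iter("testcase"):
        props = {
            prop.get("name"): float(prop.get("value"))
            for prop in testcase.iter("property")
        }
        if props:
            results[testcase.get("name")] = props
    return results
```

This makes it easy to, say, diff benchmark numbers between two CI runs without depending on how pytest laid out the rest of the report.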
codecov[bot] commented 10 months ago

Codecov Report

Merging #40 (590a9a2) into main (ac0afd0) will increase coverage by 15.63%. The diff coverage is 94.97%.

@@             Coverage Diff             @@
##             main      #40       +/-   ##
===========================================
+ Coverage   46.63%   62.26%   +15.63%     
===========================================
  Files           8       10        +2     
  Lines         491      652      +161     
  Branches       56       76       +20     
===========================================
+ Hits          229      406      +177     
+ Misses        249      226       -23     
- Partials       13       20        +7     
| Files | Coverage Δ |
|---|---|
| mir-ci/mir_ci/apps.py | 92.30% <100.00%> (+4.80%) :arrow_up: |
| mir-ci/mir_ci/conftest.py | 42.24% <85.71%> (+1.01%) :arrow_up: |
| mir-ci/mir_ci/display_server.py | 76.47% <92.30%> (+43.13%) :arrow_up: |
| mir-ci/mir_ci/benchmarker.py | 97.05% <97.05%> (ø) |
| mir-ci/mir_ci/cgroups.py | 96.66% <96.66%> (ø) |
| mir-ci/mir_ci/program.py | 89.58% <86.95%> (+3.68%) :arrow_up: |


Saviq commented 10 months ago
> 1. Give your user permissions to modify `/sys/fs/cgroup` (or be root)

I think we should get the path from the test runner / environment, in which we'd create a subdirectory and start from there.
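That suggestion might look something like the following sketch. `MIR_CI_CGROUP_ROOT` and the helper name are hypothetical, not part of this PR:

```python
import os
from pathlib import Path


def make_test_cgroup(test_name: str) -> Path:
    """Create a per-test cgroup directory under a root handed to us by the runner.

    MIR_CI_CGROUP_ROOT is an assumed environment variable: the test runner would
    point it at a delegated cgroup directory the test user can already write to,
    and we only ever create subdirectories beneath it.
    """
    root = Path(os.environ.get("MIR_CI_CGROUP_ROOT", "/sys/fs/cgroup"))
    cgroup = root / test_name
    cgroup.mkdir(parents=True, exist_ok=True)
    return cgroup
```

The benefit is that no test code ever needs elevated permissions itself; the runner (or a systemd-style delegation) decides where writes are allowed.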

> What do the test results look like in junit.xml?

I think we can skip the PID there. Is there another number than CPU percent that would be more meaningful across devices? I know you can only ever compare apples to apples, but…

mattkae commented 10 months ago
> > 1. Give your user permissions to modify `/sys/fs/cgroup` (or be root)
>
> I think we should get the path from the test runner / environment, in which we'd create a subdirectory and start from there.
>
> > What do the test results look like in junit.xml?
>
> I think we can skip the PID there. Is there another number than CPU percent that would be more meaningful across devices? I know you can only ever compare apples to apples, but…

  1. Yeah we could create a path as root and then give the user permissions to read/write to that path!
  2. We can skip PID, I agree
  3. We could show cycles perhaps? Or maybe avg cycles per second?
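For the CPU number, one device-agnostic option is to report total CPU time and let consumers derive a utilisation percentage from the wall-clock test time; both figures are already in the junit.xml example above. A sketch (not part of this PR):

```python
def avg_cpu_percent(cpu_time_us: float, wall_time_s: float) -> float:
    """Average CPU utilisation over the run: CPU time consumed / wall time.

    100.0 means one core fully busy for the whole test; values above 100
    mean work was spread across multiple cores.
    """
    return cpu_time_us / (wall_time_s * 1_000_000) * 100.0
```

With the compositor figures from the example (114179 µs of CPU over a 3.153 s test) this comes out to roughly 3.6%.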
mattkae commented 10 months ago

I would like to see if we can do well enough without polling. Do cgroup stats not give us enough numbers to calculate averages?

If we're going with cgroups after all, the added complexity of on_started and polling feels wasteful, WDYT? May be needed later for GPU testing, though, so maybe we shouldn't be throwing it away.

There's enough print()s all around that maybe it's time to get a logger going? Alternatively, there's the warnings module.
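Swapping the `print()`s for a logger is a small change; a sketch of the usual pattern (logger name and helper are illustrative, not from this PR):

```python
import logging

# Module-level logger; the "mir_ci" name is illustrative.
logger = logging.getLogger("mir_ci")


def configure_logging(verbose: bool = False) -> None:
    """One-time setup so benchmark chatter goes through logging, not print()."""
    logging.basicConfig(
        level=logging.DEBUG if verbose else logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s: %(message)s",
    )
```

Call sites then use `logger.info(...)` / `logger.debug(...)`, and noisy diagnostics can be silenced or redirected per-module without touching the tests.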

mattkae commented 10 months ago

@Saviq Do you have any opinion on what CPU benchmark we should report? Right now, I'm showing "average CPU percentage", but that might not be ideal. I have access to the total CPU usage in microseconds

mattkae commented 10 months ago
> The tricky part is getting average memory usage. AFAIK, cgroups just reports stats on the current memory usage (and the max). Without polling, we wouldn't have any idea about the average

There's a lot of numbers in e.g. memory.stat, none of them is cumulative? If there is something, we could divide over the test duration?
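To my knowledge the cgroup v2 memory counters (`memory.current`, and `memory.peak` on newer kernels) are instantaneous rather than cumulative byte-seconds, so if nothing cumulative turns up, polling stays necessary. The poller doesn't have to keep every sample, though; a running mean is enough. A sketch (not from this PR):

```python
class RunningAverage:
    """Incremental mean, so a memory poller needn't store every sample."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def add(self, sample: float) -> None:
        # Welford-style incremental update of the mean.
        self.count += 1
        self.mean += (sample - self.mean) / self.count
```

Each poll tick would read `memory.current` and feed it to `add()`; at test end, `mean` is the reported average.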

> @Saviq Do you have any opinion on what CPU benchmark we should report? Right now, I'm showing "average CPU percentage", but that might not be ideal. I have access to the total CPU usage in microseconds

Yeah that would be more resistant to hardware changes I think.
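Total CPU time comes straight out of cgroup v2's `cpu.stat`, which does carry a cumulative `usage_usec` counter; a small parsing sketch (helper name is illustrative, not from this PR):

```python
def parse_cpu_usage_usec(cpu_stat_text: str) -> int:
    """Extract the cumulative usage_usec counter from cgroup v2 cpu.stat text."""
    for line in cpu_stat_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "usage_usec":
            return int(value)
    raise KeyError("usage_usec not found in cpu.stat")
```

Reading the counter once at test start and once at test end, then reporting the difference, gives the total-microseconds figure without any polling.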