jlfwong / speedscope

🔬 A fast, interactive web-based viewer for performance profiles.
https://www.speedscope.app
MIT License
5.58k stars 247 forks source link

Stackprof: weight on-cpu samples by period rather than timestamp delta #425

Closed manuelfelipe closed 1 year ago

manuelfelipe commented 1 year ago

This attempts to improve the quality of the on-CPU profiles stackprof provides. Rather than weighing samples by their timestamp deltas, which, in our opinion, are only valid in wall-clock mode, this weighs callchains by:

S = number of samples
P = sample period in nanoseconds

W = S * P

The difference after this change is quite substantial, specially in profiles that previously were showing up with heavy IO frames:

In delta mode, we see an io select taking a long time, it is a significant portion of the profile:

236936508-709bee01-d616-4246-ba74-ab004331dcd3

But in period scaling mode, it is only a couple of sample periods ultimately:

236936693-9d44304e-a1c2-4906-b3c8-50e19e6f9f27
jlfwong commented 1 year ago

Hi @manuelfelipe!

This seems reasonable to me! Can you please include tests which would catch changes to this behaviour were it to break. I think this would mean including a new profile with the cpu mode set in https://github.com/jlfwong/speedscope/blob/main/src/import/stackprof.test.ts

coveralls commented 1 year ago

Coverage Status

coverage: 42.146% (-0.01%) from 42.16% when pulling 2b92d51817072d50d69a5827adce7061f14b581e on dalehamel:stackprof-cpu-weight-fixes-upstream into e9133be353b03a75fa4bf7c6c5990311c43e9879 on jlfwong:main.

manuelfelipe commented 1 year ago

This seems reasonable to me! Can you please include tests which would catch changes to this behaviour were it to break. I think this would mean including a new profile with the cpu mode set in https://github.com/jlfwong/speedscope/blob/main/src/import/stackprof.test.ts

Alright, added a sample profile along with a test asserting for the desired behaviour in https://github.com/jlfwong/speedscope/pull/425/commits/7ad10a136beca5da4aaae89bff6e7dbd9aec1ed9

On master, it fails (as expected) with:

  ● importCpuProfileWithProperWeights › uses samples count for weight when importing cpu profile

    expect(received).toBe(expected) // Object.is equality

    Expected: 489000
    Received: 6650772

      28 |       if (profile) {
      29 |         expect(profile.getWeightUnit()).toBe('microseconds')
    > 30 |         expect(profile.getTotalWeight()).toBe(489000)
         |                                          ^
      31 |       }
      32 |     }
      33 |   })

Also added couple of changes to the simple ruby app to allow changing the mod via an arg value, and some sleep time to showcase how weight is influenced when capturing cpu profiles

Before this change:

Screenshot 2023-06-07 at 10 11 23 AM

After

Screenshot 2023-06-07 at 10 11 52 AM
jlfwong commented 1 year ago

Thanks for adding the test and ensuring that it fails on the main branch!

Looks like there are failing test runs & a merge conflict that needs resolving before this can land.

manuelfelipe commented 1 year ago

thanks @jlfwong. Rebased base on latest changes on main, and re-ran run jest -- -u in https://github.com/jlfwong/speedscope/pull/425/commits/2b92d51817072d50d69a5827adce7061f14b581e to try and get CI happy.

Not sure I fully understand what is the actual thing failing failure in https://github.com/jlfwong/speedscope/actions/runs/5201005116/jobs/9543115054. Locally tests are passing, so hoping the thing that was causing the issue are the outdated snapshots.

dalehamel commented 1 year ago

FYI @jlfwong I enabled actions on my fork, Manuel pushed to it, and it passes there https://github.com/dalehamel/speedscope/actions/runs/5280170594 (since we cannot run workflows here, this seemed the most expedient way).

jlfwong commented 1 year ago

Thanks! This will go out with the next release

jlfwong commented 1 year ago

This is now live on https://speedscope.app and published to npm as part of v1.15.2. Thanks for your contribution!