Closed by mcomella 3 years ago
desc | mean (ms) | median (ms) | max (ms) |
---|---|---|---|
normal | 292.33 | 283 | 379 |
add-ons | 364.27 | 356 | 525 |
busy (500% CPU) | 555.53 | 494 | 1273 |
busy (800% CPU) | 621.33 | 545 | 1726 |
2x busy 800% | | | |
FxA (blocked on https://github.com/mozilla-mobile/fenix/issues/17575) | | | |
lower power? | | | |
I had to test this on an emulator, rather than my G5, so it has separate results:

desc | mean (ms) | median (ms) | max (ms) |
---|---|---|---|
slow network | 189 | 171 | 343 |
Notes:
- These times are `adb shell am start -W`'s "TotalTime", which is 100-200ms more than if we logcatted the onDraw call.

My script for running these tests is:
: > output.txt  # start with an empty results file
for i in $(seq 1 15); do
    adb shell am start -W -t 'text/html' -d 'https://mozilla-mobile.github.io/perf-tools/mozperftest-test-page.html' -a android.intent.action.VIEW org.mozilla.fenix/org.mozilla.fenix.IntentReceiverActivity | grep "TotalTime" | cut -d ' ' -f 2 | tee -a output.txt  # keep only the TotalTime value (ms)
    sleep 2
    adb shell input keyevent KEYCODE_BACK  # back out so the process stays alive for the next replicate
    sleep 1
done
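The mean, median and max columns in these tables can be derived from `output.txt` with a few lines of shell. This is a minimal sketch, not necessarily how the numbers above were produced; the function name `summarize` is made up for illustration, and it assumes one millisecond value per line.

```shell
# Summarize a file with one TotalTime (ms) per line into mean, median and max.
# Blank or non-numeric lines are skipped; carriage returns from adb are stripped.
summarize() {
  tr -d '\r' < "$1" | grep -E '^[0-9]+(\.[0-9]+)?$' | LC_ALL=C sort -n | LC_ALL=C awk '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) { print "no samples"; exit 1 }
      # Median: the middle value, or the average of the two middle values.
      median = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
      printf "n=%d mean=%.2f median=%g max=%g\n", NR, sum / NR, median, v[NR]
    }'
}
```

Usage: `summarize output.txt` after the loop finishes.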
I accidentally updated to the latest Nightly, so I started a new set of results:

desc | mean (ms) | median (ms) | max (ms) |
---|---|---|---|
normal | 340.13 | 322.0 | 513.0 |
open tabs (100+ of example.com) | 407.93 | 386.0 | 607.0 |
open tabs (alexa top 50-ish) | 366.33 | 347.0 | 536.0 |
FxA (signed in maybe 15s before test so perhaps syncing; tiny profile) | 344.8 | 350.0 | 458.0 |
busy (2x 800% CPU) | 1254.8 | 470.0 | 9541.0 |
busy (2x 800% CPU) again, didn't restart process before running | 480.8 | 466.0 | 656.0 |
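Notes:
- The raw replicates for the first busy (2x 800% CPU) run, in order, were: [9541.0, 1678.0, 1776.0, 470.0, 857.0, 672.0, 452.0, 536.0, 404.0, 456.0, 558.0, 373.0, 359.0, 356.0, 334.0]. The first replicate alone accounts for the gap between the mean (1254.8) and the median (470.0).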
Given how much longer the first loads are compared to subsequent ones (in particular open tabs and busy), if we're really looking for performance cliffs, perhaps we should be stopping and starting the process before measuring the next replicate.
It's curious that the normal start time increased from last time (mean of 292.33 ms before, 340.13 ms now)...
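Stopping and starting the process before each replicate could look roughly like the sketch below. It is untested against a real device here, and several pieces are assumptions: the function name `measure_fresh_warm_view`, the `SETTLE_SECS` settle time, and the idea that launching via `monkey` and then pressing BACK leaves the process alive with no activity, as in the original script.

```shell
PKG=org.mozilla.fenix
URL='https://mozilla-mobile.github.io/perf-tools/mozperftest-test-page.html'
SETTLE_SECS=${SETTLE_SECS:-5}   # assumed settle time; tune per device

# One replicate: fresh process, backed out of, then a timed warm VIEW intent.
# Prints the TotalTime value (ms) reported by `am start -W`.
measure_fresh_warm_view() {
  adb shell am force-stop "$PKG"                  # kill the existing process
  adb shell monkey -p "$PKG" -c android.intent.category.LAUNCHER 1 >/dev/null 2>&1  # relaunch it
  sleep "$SETTLE_SECS"
  adb shell input keyevent KEYCODE_BACK           # leave the activity, keep the process
  sleep "$SETTLE_SECS"
  adb shell am start -W -t 'text/html' -d "$URL" -a android.intent.action.VIEW \
    "$PKG/org.mozilla.fenix.IntentReceiverActivity" \
    | tr -d '\r' | grep "TotalTime" | cut -d ' ' -f 2
}
```

Usage: `for i in $(seq 1 15); do measure_fresh_warm_view | tee -a output.txt; done`. Every replicate is then a "first load" after a process restart, which is the case where the cliffs showed up.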
I took a profile of WARM LINK (the page might be cached) when the device is under the 800% CPU background app load: https://share.firefox.dev/2NAGtc2
However, I noticed the flame graph and stack chart do not line up so I'm not sure how trustworthy it is (the former seems to be calculated from sample count while the latter is from runtime).
When the device is under load, I'd expect the UI thread (in addition to the gecko thread) to be throttled for heat concerns – as such, I'm not sure that there'd be anything we can do about it. Perhaps we should bucket our start up time telemetry into "device under load" and "device not under load" so we can do separate analyses.
Then again, in practice, how often are Android devices under heavy load when they're starting apps?
I filtered the Google Play Console slow warm start by device. For the G5, 18.3% are considered "slow warm starts" (> 2s). In my local testing (on an empty device), a first replicate seemed to be ~400-600ms (to be fair, I took a very limited number of samples) so we can probably trust that our telemetry data is pointing to a real problem.
However, Google's definition of WARM also includes cases where the process has to be restarted because the system saved some state in a bundle. It's unclear how many of the slow warm starts fall into that case.
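To put local runs next to the Play Console figure, one could compute the share of local replicates above the same 2s threshold. This is only a rough sketch: the function name `slow_share` is made up, and it assumes that `am start -W`'s TotalTime is comparable to whatever the Play Console measures, which may not hold.

```shell
# Print the percentage of samples (one ms value per line) above a threshold.
# $1: results file, $2: threshold in ms (defaults to 2000, i.e. the "> 2s" cutoff).
slow_share() {
  tr -d '\r' < "$1" | LC_ALL=C awk -v limit="${2:-2000}" '
    /^[0-9]+(\.[0-9]+)?$/ { n++; if ($1 > limit) slow++ }
    END {
      if (n) printf "%.1f%% slow (%d of %d)\n", 100 * slow / n, slow, n
      else print "no samples"
    }'
}
```

Usage: `slow_share output.txt` or, for a different cutoff, `slow_share output.txt 1000`.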
I looked into GPlay console and Firebase Perf Monitoring to understand if they can help us identify perf cliffs. I wrote up a brief analysis: https://docs.google.com/document/d/1FWjM5gQgAlgm8d7m28gau7lHi-sKTDBhnAdvnwmpMr0/edit#
I can't think of anything else super actionable to do here without data analysis to confirm that this is a real problem we're seeing.
Let's repurpose this for now: with ecsmyth changing projects, this bug will be to own finding the performance cliffs in warm startup in general. Potentially consider filing a follow-up bug for this.
In our brief analysis, we didn't find any indication of perf cliffs or perf issues in WARM VIEW. As such, we've decided to focus on improving start up generally and adding simple telemetry, which might point us in specific directions.
Closing as there's nothing else to do here.
We should do local experiments to try to figure out when WARM VIEW may result in a performance cliff so that we can correlate that with the telemetry data. Here are a few cases we thought of that we should check:
We may need to modify the code to make accessing the start up time trivial: e.g. output a log with the time each time start up completes. It may help to come up with a quick script to run warm start ups too so we can get more than a single run.