Closed asurdej-comcast closed 2 months ago
Would be nice to profile this instead of working-around the issue.
Moving to draft as got report that this patch may cause network package drops and video macrobloking. Need more analysis
Checked a bit with perf
here on my x86_64 desktop, the main WebProcess bottleneck seems to be in BoringSSL... aes_nohw_encrypt_batch
... I see the fipsmodule/aes has optimizations for ARM. Are they enabled in your build?
I see the libwebrtc cmake stuff passing -DOPENSSL_NO_ASM
to the build so... I doubt you have ASM enabled, unless you modified the build downstream.
We don't think we have any custom build settings for webrtc. We use exactly the same config as 2.38 has. btw, I'm off this week so will continue analysis on Monday and my responses may be delayed
Hi @philn quick update, I checked AmazonLuna gameplay (Fallout) with perf
tool on my device (broadcom) and got pretty similar results as you shared. top 5 come from boring ssl. We had ASM optimization disabled with OPENSSL_NO_ASM option. After enabling it (by brute force as webkit doesn't use boringssl build system GN but takes sources directly) things looks much better. With default Webkit webrtc provider CPU usage drop is like 130% vs 100%. Including commit from this PR (WebCore webrtc provider - webrtc sockets in webprocess) it drops to ~70%.
Do you know if it is safe to enable ASM? Any drawbacks?
POC patch. Had to disable x25519-asm-arm.S because of
Error: selected processor does not support `strd r4,[sp,#0]' in ARM mode
and
Error: selected processor does not support `ldrd r4,[sp,#0]' in ARM mode
Yeah, ASM should be safe to enable imho and specially on embedded devices where CPU resources are very limited... The upstream PR is https://github.com/WebKit/WebKit/pull/27977
great, is it intended to land downstream somewhere soon?
great, is it intended to land downstream somewhere soon?
Yes, I'm doing it now. Should be in wpe-2.38 once the build passed here.
great, is it intended to land downstream somewhere soon?
Yes, I'm doing it now. Should be in wpe-2.38 once the build passed here.
Done.
We observe very high CPU usage while using webRTC in AmazonLuna gaming app. Usually it's around 150% (rough measuring with 'top' cmd). This traffic is mainly generated by WPEWebProcess ~110% and WPENetworkProcess ~20% In WPEWebProcess there are 3 threads responsible for most of the usage WebKitWebRTCNet, main thread and WebKitWebRTCSig.
With this patch CPU usage drops around 20%. WPEWebProcess usually stays below 100% while WPENetworkProcess CPU usage is not significant any more.
This patch changes default LibWebRTCProvider. Not sure if this is beneficial for others. We can flip it to keep default provider from Network and use WebCore provider only if env is set.