QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/
539 stars 48 forks source link

qrexec: support logging how long each part of processing takes #9363

Open DemiMarie opened 3 months ago

DemiMarie commented 3 months ago

How to file a helpful issue

The problem you're addressing (if any)

If one wants to make qrexec faster, one needs to know what parts of qrexec are the bottleneck. Right now, this isn’t easy unless one patches the code. Yes, it’s Python and not hard to modify, but it would be nice to not need to do that.

The solution you'd like

Provide a flag (off by default) that spews performance information to the system log.

The value to a user, and who that user might be

People who want to make qrexec faster will have an easier time getting real data and not file issues like #6266 or #9361 that do not address the actual problems.

Completion criteria checklist

(This section is for developer use only. Please do not modify it.)

3hhh commented 3 months ago

Related: #7075

I'd also be interested in this.

However at least on my system 40% (2,7s of 6.6s, kernel & Xen time excluded) of the qrexec wait time comes from systemd attempting to reach network.target. This was different back in 4.0 when qrexec didn't need to wait for the network stack as it didn't use sockets.

marmarek commented 3 months ago

However at least on my system 40% (2,7s of 6.6s, kernel & Xen time excluded) of the qrexec wait time comes from systemd attempting to reach network.target

This sounds like you talk about system startup time. This is not what this issue is about. This one is about measuring individual stages of individual qrexec calls.

This was different back in 4.0 when qrexec didn't need to wait for the network stack as it didn't use sockets.

I don't think it's related to recently added socket service feature, there was no change in service dependencies.

3hhh commented 3 months ago

On 7/21/24 19:58, Marek Marczykowski-Górecki wrote:

However at least on my system 40% (2,7s of 6.6s, kernel & Xen time excluded) of the qrexec wait time comes from systemd attempting to reach network.target

This sounds like you talk about system startup time. This is not what this issue is about. This one is about measuring individual stages of individual qrexec calls.

Yes, I'm aware of that. I was merely trying to make a point about the limitations of this issue.

This was different back in 4.0 when qrexec didn't need to wait for the network stack as it didn't use sockets.

I don't think it's related to recently added socket service feature, there was no change in service dependencies.

If you still happen to have a 4.0 install available, compare the output of systemd-analyze critical-chain qubes-qrexec-agent.service with that of 4.2 on a comparable VM with similar memory size etc. Unfortunately I didn't find it in my historical data, so you may be correct after all.