storaged-project / udisks

The UDisks project provides a daemon, tools and libraries to access and manipulate disks, storage devices and technologies.
https://storaged.org/doc/udisks2-api/latest/
Other
348 stars 142 forks source link

Decrease start time below 100 ms #521

Open paulmenzel opened 6 years ago

paulmenzel commented 6 years ago

On my ASRock E350M1 with coreboot with Linux 4.16 and Debian Sid/unstable with udisks2 2.7.6-2, according to the time stamps in the systemd journal (or systemd-analyze blame), the service udisks2 takes around 500 ms to start, and holds up the graphical target from completion.

How can this be debugged? strace?

paulmenzel commented 6 years ago

Stopping the service, and starting udisksd manually, it still takes over a 100 ms. Is D-Bus so slow?

paulmenzel commented 6 years ago

This is a Debian Sid/unstable system with a HDD, where the service was disabled.

plumpsklo:~$ sudo /usr/lib/udisks2/udisksd --debug
udisks-Message: 18:26:43.526: udisks daemon version 2.7.6 starting
udisks-Message: 18:26:44.324: Acquired the name org.freedesktop.UDisks2 on the system message bus
^Cudisks-Message: 18:26:51.998: udisks daemon version 2.7.6 exiting
plumpsklo:~$ sudo strace ^Csr/lib/udisks2/udisksd --debug
plumpsklo:~$ sudo /usr/lib/udisks2/udisksd --debug
udisks-Message: 18:27:44.117: udisks daemon version 2.7.6 starting
udisks-Message: 18:27:44.564: Acquired the name org.freedesktop.UDisks2 on the system message bus
^Cudisks-Message: 18:27:47.830: udisks daemon version 2.7.6 exiting
plumpsklo:~$ sudo /usr/lib/udisks2/udisksd --debug
udisks-Message: 18:27:48.644: udisks daemon version 2.7.6 starting
udisks-Message: 18:27:49.102: Acquired the name org.freedesktop.UDisks2 on the system message bus
^Cudisks-Message: 18:27:51.150: udisks daemon version 2.7.6 exiting
plumpsklo:~$ ls -lh /usr/lib/udisks2/udisksd
-rwxr-xr-x 1 root root 435K Mar  8 18:47 /usr/lib/udisks2/udisksd

So, when having to load the binary, it takes 800 ms, when it is in the cache, still 450 ms.

paulmenzel commented 6 years ago

According to sudo strace -T -tt /usr/lib/udisks2/udisksd --debug, 130 ms spent below, reading some virtual block device.


[…]
18:44:08.460275 lstat("/sys/devices/virtual/block/pktcdvd0/dm/uuid", 0x7ffd12d0f970) = -1 ENOENT (No such file or directory) <0.000006>
18:44:08.460303 futex(0x7f7c9a415f38, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000004>
18:44:08.460329 futex(0x7f7c9a415f38, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000004>
18:44:08.460355 futex(0x7f7c9a415f38, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000004>
18:44:08.460378 futex(0x7f7c9a415f38, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000004>
18:44:08.460495 lstat("/sys/devices/virtual/block/pktcdvd0/removable", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 <0.000008>
18:44:08.460518 openat(AT_FDCWD, "/sys/devices/virtual/block/pktcdvd0/removable", O_RDONLY|O_CLOEXEC) = 13 <0.000008>
18:44:08.460538 fstat(13, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 <0.000004>
18:44:08.460557 fstat(13, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 <0.000004>
18:44:08.460578 read(13, "1\n", 4096)   = 2 <0.000007>
18:44:08.460597 read(13, "", 4096)      = 0 <0.000004>
18:44:08.460613 close(13)               = 0 <0.000005>
18:44:08.460631 openat(AT_FDCWD, "/dev/pktcdvd/pktcdvd0", O_RDONLY) = -1 ENOMEDIUM (No medium found) <0.122780>
18:44:08.583450 lstat("/sys/devices/virtual/block/pktcdvd0/ro", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 <0.000038>
18:44:08.583535 openat(AT_FDCWD, "/sys/devices/virtual/block/pktcdvd0/ro", O_RDONLY|O_CLOEXEC) = 13 <0.000036>
18:44:08.583586 fstat(13, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 <0.000005>
18:44:08.583605 fstat(13, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 <0.000004>
[…]
[udisks2_2.7.6–strace-T-tt.txt](https://github.com/storaged-project/udisks/files/1895554/udisks2_2.7.6.strace-T-tt.txt)
vpodzime commented 6 years ago

Thanks for the valuable input! It is a bit weird that probing that device takes so long, but it looks like those 120ms are spent in kernel when trying to open that device (and kernel probing for medium). There's nothing udisks could do about it. But of course, ideas welcome!

tbzatek commented 5 years ago

udisks tries to enumerate devices on startup so that is guaranteed all objects are populated when it appears on the bus. It also sends various ATA probe commands to some devices, such as the CD-ROM drives in your case. And with modules enabled startup could easily take even more time.

Question is why is your system blocked waiting on udisks - is it just a matter of wrong dependencies? Udisks is an autospawned d-bus daemon and clients are supposed to wait for its startup to get actual state of block devices in the system.

paulmenzel commented 5 years ago

On 04/15/2019 02:09 PM, Tomáš Bžatek wrote:

udisks tries to enumerate devices on startup so that is guaranteed all objects are populated when it appears on the bus. It also sends various ATA probe commands to some devices, such as the CD-ROM drives in your case. And with modules enabled startup could easily take even more time.

That sounds broken, as devices can be plugged in at run-time any time. Shouldn’t the clients “see” the devices the moment they query them, and subscribe to some update callback(?) to be notified, when new devices are discovered?

Question is why is your system blocked waiting on udisks - is it just a matter of wrong dependencies? Udisks is an autospawned d-bus daemon and clients are supposed to wait for its startup to get actual state of block devices in the system.

If GDM does not depend on udisks, then I guess it is just a cosmetic problem.

tbzatek commented 5 years ago

Shouldn’t the clients “see” the devices the moment they query them, and subscribe to some update callback(?) to be notified, when new devices are discovered?

Yes and no. Traditionally udisks has been designed fully async expecting clients to listen to signals and act accordingly (in fully asynchronous manner, e.g. in GUI). However recently we've seen more usage cases where clients expect objects to be fully updated once a particular sync method call returns (e.g. I/O operations over block devices or filesystems), effectively moving the burden of waiting for update to the daemon. We're obliged not to change any existing behaviour once released, yet we're still try to fix the cases where simple fix is possible (in the expense of prolonging the operation by a short time, which is usually negligible given the whole operation duration). See related #581.

Not all clients are listening to signals on the object manager and just expect the objects to be there when udisks appears on the bus. We can't change that behaviour simply for compatibility reasons.

As for missed uevents, udisks starts listening to uevents on startup before the fist (synchronous) enumeration of all available devices occurs. That said, signal processing is theoretically queued until mainloop starts processing further events, no events should be missed. Also, on startup udisks ignores devices that still being probed by udev, see #635, but we should get another uevent when it's ready.

As a potential improvement we could perform probing of devices simultaneously, however that brings a ton of new synchronization issues within the udisks core (think of multi-disk device and representation of its parent). Similar case with modules activation on startup (not enabled by default though) and housekeeping.

Also your suggestion can be applied the other way round - don't let clients wait for udisks to start, instead watch for well-known name owner and subscribe to object manager signals. Both can be done fully async in clients.

Question is why is your system blocked waiting on udisks - is it just a matter of wrong dependencies? Udisks is an autospawned d-bus daemon and clients are supposed to wait for its startup to get actual state of block devices in the system. If GDM does not depend on udisks, then I guess it is just a cosmetic problem.

I see now we inject udisks2.service systemd unit in the graphical.target. That might not be necessary after all. https://github.com/storaged-project/udisks/blob/b7149fac04616ae71a60b4bbe00943ee5d80f218/data/udisks2.service.in#L12

vpodzime commented 5 years ago

I see now we inject udisks2.service systemd unit in the graphical.target. That might not be necessary after all.

But that's only effective after the user runs systemctl enable udisks2, right? So I don't think that is the problem. Also, see https://bugs.freedesktop.org/show_bug.cgi?id=92486 for a discussion regarding this.

tbzatek commented 5 years ago

But that's only effective after the user runs systemctl enable udisks2, right? So I don't think that is the problem. Also, see https://bugs.freedesktop.org/show_bug.cgi?id=92486 for a discussion regarding this.

Right, might be just a packaging/distro problem then (e.g. the %systemd_post rpm macro). Would be interesting to find out what exactly requires/starts udisks in the reporter's case.

That freedesktop bug contains couple of notable ideas.

https://cgit.freedesktop.org/upower/commit/?id=07a6c353a26689e9907788d29d78f8b5c0212b31

+# We pull this in by graphical.target instead of waiting for the bus
+# activation, to speed things up a little: gdm uses this anyway so it is nice
+# if it is already around when gdm wants to use it and doesn't have to wait for
+# it.
+WantedBy=graphical.target

It still might make sense to start udisks on startup for ATA disk configuration restore. Not sure whether it is still a valid use case nowadays.