storaged-project / udisks

The UDisks project provides a daemon, tools and libraries to access and manipulate disks, storage devices and technologies.
http://storaged.org/doc/udisks2-api/latest/

Please let me configure the housekeeping interval or otherwise unbreak externally configured spindown #407

Open anordal opened 6 years ago

anordal commented 6 years ago

Let's solve this issue.

I have two old Western Digital IDE hard disks that won't spin down while udisksd is running (unless I set their spindown timeout really short). I've had to run:

sudo killall -SIGSTOP udisksd

and

sudo killall -SIGCONT udisksd

in order to hang/unhang the daemon as needed.

I tested modifying udisksd to set its housekeeping timeout really high (way outside the configurable spindown timeout for good measure) and that works for me:

--- a/src/udiskslinuxprovider.c
+++ b/src/udiskslinuxprovider.c
@@ -660,7 +660,7 @@ udisks_linux_provider_start (UDisksProvider *_provider)
   udisks_info ("Initialization complete");

   /* schedule housekeeping for every 10 minutes */
-  provider->housekeeping_timeout = g_timeout_add_seconds (10*60,
+  provider->housekeeping_timeout = g_timeout_add_seconds (13*60*60,
                                                           on_housekeeping_timeout,
                                                           provider);

A less hardcoded solution would be highly appreciated, so I don't have to run a custom version of udisksd. Maybe some sort of blacklist of disks not to poll too often, as a configuration file.
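The reasoning behind the patch can be put as a toy model. The sketch below assumes, as this report suggests but does not prove, that every housekeeping pass touches the drive and restarts its idle timer. `can_spin_down` is a hypothetical helper for illustration, not part of udisks.

```python
# Toy model (assumption): every housekeeping pass touches the drive and
# restarts its idle timer, so the drive can only reach standby if its
# spindown timeout is shorter than the gap between two passes.

STOCK_INTERVAL_S = 10 * 60          # value in the unpatched source: 600 s
PATCHED_INTERVAL_S = 13 * 60 * 60   # value in the patch above: 46800 s


def can_spin_down(spindown_timeout_s: int, housekeeping_interval_s: int) -> bool:
    """Return True if the idle timer can expire between two housekeeping passes."""
    return spindown_timeout_s < housekeeping_interval_s


if __name__ == "__main__":
    for minutes in (5, 9, 30, 180):
        timeout_s = minutes * 60
        print(
            f"{minutes:>3} min timeout: "
            f"stock={can_spin_down(timeout_s, STOCK_INTERVAL_S)} "
            f"patched={can_spin_down(timeout_s, PATCHED_INTERVAL_S)}"
        )
```

Under this model a 30-minute timeout never expires with the stock interval but does with the patched one. The model ignores any other process that touches the disk.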

dp-alvarez commented 6 years ago

I also have this problem: HDD spindown only works with timeouts under 10 minutes, which is too short.

There is some discussion going on here: https://bugs.launchpad.net/ubuntu/+source/udisks2/+bug/1281588

dark-penguin commented 5 years ago

I think this exact patch is sensible and should be accepted. Maybe they will notice it if someone makes a PR ?..

Rationale:

dark-penguin commented 5 years ago

Seems like this commit that was supposed to fix this problem does not work for some reason. If it could be fixed, that would be a perfect solution.

dark-penguin commented 5 years ago

I have found the problem.

If I disable smartd, and set the sleep timeout via per-device config files in /etc/udisks2, then udisksd indeed does not check those devices if there was no activity on them since the last check.

If I remove those device-specific config files and set the sleep timeout via smartd instead, then smartd seems to be unaware that those devices are actually set to suspend soon, and so it checks them.
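The behaviour described above, where a drive is skipped if it saw no activity since the last check, can be sketched as a comparison of the kernel's per-device I/O counters. This is a minimal sketch, not udisks's actual code: `read_io_counters` and `should_poll` are hypothetical names, and the only external fact relied on is that Linux exposes cumulative counters in /sys/block/<device>/stat.

```python
from pathlib import Path
from typing import Optional, Tuple

Counters = Tuple[int, ...]


def read_io_counters(device: str, sysfs_root: str = "/sys/block") -> Counters:
    """Read the kernel's cumulative I/O counters for a block device, e.g. "sda"."""
    text = Path(sysfs_root, device, "stat").read_text()
    return tuple(int(field) for field in text.split())


def should_poll(previous: Optional[Counters], current: Counters) -> bool:
    """Poll on the first pass, or when any counter moved since the last pass."""
    return previous is None or previous != current


if __name__ == "__main__":
    # Hypothetical usage: remember the counters between housekeeping passes.
    last_seen: Optional[Counters] = None
    now = read_io_counters("sda")  # assumes a disk named sda exists
    if should_poll(last_seen, now):
        print("activity since last pass: safe to query the drive")
    else:
        print("no activity: skip the query so the drive can stay asleep")
    last_seen = now
```

A second daemon that queries the drive on its own schedule has no access to this state, which matches the observation that the skip only helps when udisksd itself knows about the timeout.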

Now the situation seems obvious:

anordal commented 5 years ago

If I set the sleep timeout via per-device config files in /etc/udisks2, then udisksd indeed does not check those devices if there was no activity on them since the last check.

So that is how the spindown timeout is supposed to be configured nowadays … I feel mind blown and betrayed at the same time. Running hdparm at boot was a solved problem; I did not have the imagination to research other ways.

You don't even need smartd to reproduce this problem – I'm not running it (and I have that supposed fix). But you have a good point that there could be other daemons poking the disks in tandem with udisks, thereby violating my assumption that a polling interval above the maximum configurable spindown timeout would be safe.

I totally agree that it can't just break sleeping scheduled by hdparm or smartd. Because that's surprising. Alternatively, the surprise part must be fixed, by documenting it as a deficiency along with its workarounds.

dark-penguin commented 5 years ago

The most ridiculous thing is that there is no way to configure SMART check frequency. As I understand it, this is desktop-oriented software: it does not send email reports like server software, and it handles things like automount. But at the same time, it insists on checking SMART data every ten minutes.

I don't even understand why does it need this data. smartd needs it only to notify the user, but I certainly don't want any software reacting on my SMART state in any way without my permission. At the same time, the suggested way to configure sleep timeouts with smartd is "set a SMART check timeout greater than sleep timeout". Which means, it's fine to only check SMART once in a few hours, as it's recommended by the "serious" server software that actually needs it for a reason. So I believe checking SMART once a day by default should be fine for a desktop user. If they know what SMART is and want to check it more often, then they will configure their system to do so - either by smartd, or by configuring udisks2 (oh wait, there is no configuration option for that!).

For now, the best solution I could find is disabling udisksd. You can't uninstall it because gvfs depends on it, and the whole desktop environment depends on gvfs, but at least you can disable it. Seeing that I could not find any information about what else does it do and why do we need it, and having already caught it red-handed doing shady stuff, I think I would actually be safer disabling it completely.

Another example of udisks2 misbehaving in the past is automounting everything without thinking, for example drives already mounted on virtual machines. That's the kind of automation I certainly don't need; if I need things mounted, I'll do that myself. So I guess disabling the daemon will do more good than bad to protect my drives...

dark-penguin commented 5 years ago

So, now we see that this has nothing to do with "out-of-spec" hard drives. All hard drives are affected. I guess we can change the name of this issue to "please provide a way to stop udisks2 from breaking smartd and hdparm functionality".

anordal commented 5 years ago

at least you can disable it

Not an option if you use any KDE software, or KDE itself, as they will mystically hang forever on startup if udisks2 isn't answering on D-bus. That was why I had to SIGSTOP/SIGCONT udisks2 to let it run only when I needed it to, without stopping and starting the daemon (as that would spin up everything).

tbzatek commented 5 years ago

Not an option if you use any KDE software, or KDE itself, as they will mystically hang forever on startup if udisks2 isn't answering on D-bus. That was why I had to SIGSTOP/SIGCONT udisks2 to let it run only when I needed it to, without stopping and starting the daemon (as that would spin up everything).

That looks like a bug somewhere in KDE. Please file a bug report there; udisks clients are supposed to handle blocking calls asynchronously and should not block the rest of the desktop. There are many possible causes of delay, e.g. waiting for a CD-ROM drive to spin up.

tbzatek commented 5 years ago

I don't even understand why does it need this data. smartd needs it only to notify the user

And that's exactly the use case SMART monitoring has been implemented in udisks. You don't need smartd configured or installed on your system to be notified on the desktop that any of your drives is failing. There are several related plugins in gnome-settings-daemon that monitor various kinds of resources, not only the health of your physical disks.

but I certainly don't want any software reacting on my SMART state in any way without my permission.

Any other software is free to "react" on SMART data.

Another example of udisks2 misbehaving in the past is automounting everything without thinking, for example drives already mounted on virtual machines. That's the kind of automation I certainly don't need; if I need things mounted, I'll do that myself. So I guess disabling the daemon will do more good than bad to protect my drives...

Please open a separate ticket on this issue. There are more parties involved in automounting that carry the actual automounting policies, udisks usually acts only as the executive party doing the real mounting job.

Still such scenario shouldn't happen. If the drive is mounted on a virtual machine, it is a responsibility of the VM to lock it exclusively in the first place.

tbzatek commented 5 years ago

Thanks for opening #668, it helps keep the discussion separate from a real RFE. Let's continue with the polemic here.

So tweaking the housekeeping interval or making it user-configurable is most likely inevitable anyway. The problem is that there are different kinds of housekeeping in udisks, ATA SMART monitoring being just one of them. With the introduction of modules we also perform housekeeping for each one of them, and each module may actually perform multiple (unrelated) tasks. Moreover, this is currently tied to a single interval, even for modules. Fortunately for us, the module interface is not a public API and we don't support out-of-tree modules, which makes the necessary modifications easy.

More work is needed here and this should be first thoroughly thought through.

dark-penguin commented 5 years ago

at least you can disable it

Not an option if you use any KDE software, or KDE itself, as they will mystically hang forever on startup if udisks2 isn't answering on D-bus. That was why I had to SIGSTOP/SIGCONT udisks2 to let it run only when I needed it to, without stopping and starting the daemon (as that would spin up everything).

That sounds like something depends on it, which systemd does not make easy to troubleshoot... I just tried masking udisks2 on Kubuntu Bionic (the only test machine with KDE that I have), and it booted fine. Still, usually systemd should say what exactly are we waiting for.

I don't even understand why does it need this data. smartd needs it only to notify the user

And that's exactly the use case SMART monitoring has been implemented in udisks.

I thought so! Then it is certainly no big deal if it checks your disks only once a day (and at every bootup), and certainly there should be an easy way to disable this functionality?.. Especially if the drawback of not having a way to disable it is this serious.

but I certainly don't want any software reacting on my SMART state in any way without my permission.

Any other software is free to "react" on SMART data.

I thought it might be doing something else other than notifying the user, which I certainly would not want - it is a scary thought that some software could do something with my disks without any way to disable it!

Another example of udisks2 misbehaving in the past is automounting everything without thinking, for example drives already mounted on virtual machines. That's the kind of automation I certainly don't need; if I need things mounted, I'll do that myself. So I guess disabling the daemon will do more good than bad to protect my drives...

Please open a separate ticket on this issue. There are more parties involved in automounting that carry the actual automounting policies, udisks usually acts only as the executive party doing the real mounting job.

Still such scenario shouldn't happen. If the drive is mounted on a virtual machine, it is a responsibility of the VM to lock it exclusively in the first place.

I remember seeing this issue somewhere; as I see here, it was apparently fixed before udisks2.

The general issue here is that if some software cannot be "simply not used" because it is a core part of the system, then its functionality should be really well documented and very tunable. The udisks2 man page is very short, with very few functions described, so it is basically undocumented and unconfigurable. (I would be happy to be proven wrong about it, or help change it!) When people think about this thing touching their disks, a panic attack is imminent! :) Then you start googling, and see even more potentially dangerous issues in the past... Since this is indeed more of a general "polemic" issue than a specific RFE, and the one people will find first, I thought that it would be helpful to post my findings about what options we have for disabling udisks2 and why you might want to do it. Is there any document where we could read about everything udisks2 does (other than the source code) and how to disable specific parts of it?

Back to the issue at hand:

So, the best course of action I could think of is:

If somebody is willing to put more effort into this than providing an easy workaround (and I'm not necessarily saying it's worth the effort), then consider the following algorithm as a starting point:

Treat all drives like they are going to sleep, but even more carefully:

This way, we can synchronize up to one minute to smartd or anything else, which should be fine in most cases, if not all of them. This would require separate countdowns for each drive; is that going to be a problem?..

dark-penguin commented 2 years ago

Could this be a configurable option? If we could specify the default timeout in udisks2.conf, this would actually be a fix rather than a workaround to make the problem harder to notice for most people.

I often swap drives in some machines, so specifying it per-drive makes this something you need to remember to do. And I want a 3-hour timeout, so 1 hour is still not enough; I still have to configure it.

EDIT: #668 is asking to have a way to configure the sleep timeout, and this is asking to have a way to configure the housekeeping interval. Either one would help - and wouldn't it make more sense to have parameters configurable rather than hard-coded?..
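For reference, the per-drive route mentioned above would look roughly like this for a 3-hour timeout. This is a sketch under stated assumptions: the encoding follows the hdparm -S convention (1 to 240 are multiples of 5 seconds, 241 to 251 are units of 30 minutes), and the `[ATA]` group with a `StandbyTimeout` key is the per-drive format described in the udisks(8) man page. The function names are hypothetical, and the file name (/etc/udisks2/ followed by the drive's identifier and .conf) is left out because it depends on the drive.

```python
def encode_standby_timeout(seconds: int) -> int:
    """Encode a timeout in seconds as an hdparm -S style value (0..251)."""
    if seconds == 0:
        return 0  # 0 disables the timeout
    if 5 <= seconds <= 20 * 60 and seconds % 5 == 0:
        return seconds // 5  # 1..240: multiples of 5 seconds
    if 30 * 60 <= seconds <= 11 * 30 * 60 and seconds % (30 * 60) == 0:
        return 240 + seconds // (30 * 60)  # 241..251: units of 30 minutes
    raise ValueError(f"{seconds} s cannot be represented exactly")


def render_drive_conf(standby_value: int) -> str:
    """Render the [ATA] section of a per-drive udisks2 configuration file."""
    return f"[ATA]\nStandbyTimeout={standby_value}\n"


if __name__ == "__main__":
    # Three hours is 6 units of 30 minutes, which encodes as 246.
    print(render_drive_conf(encode_standby_timeout(3 * 60 * 60)), end="")
```

Because the file is keyed to one drive, swapping disks means writing a new file each time, which is the inconvenience a global default would remove.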

anordal commented 2 years ago

Note that the bump to 1 hour housekeeping interval is a distro patch – you only get it if you use PLD Linux.

And I'm still using Weston instead of KDE, 4 years later, because I can't be bothered to write that config file.

anordal commented 1 year ago

I stumbled over this, which may explain my observation that KDE programs, and KDE itself, hang waiting for udisks: https://blog.broulik.de/2022/11/performance-musings/

The most important thing to remember with Qt DBus: Never use QDBusInterface. This innocent-looking class does a blocking introspection of the interface in its constructor! (…) On my laptop, I was able to speed up Dolphin’s startup by 50ms just by removing some QDBusInterface usage in Solid (the Framework which enumerates storage devices).

lockie commented 1 year ago

Solved this annoying problem by apt purge udisks2. Good riddance.

nlgranger commented 1 year ago

Udisks2 is a dependency of many packages. You can, however, disable the service. With systemd: systemctl disable udisks2. On systems that 'hard-code' enabling the udisks2 service, you can mask it: systemctl mask udisks2.

dark-penguin commented 1 year ago

But that also removes its useful functionality - I don't remember exactly what was it, but there was something useful about it. Something about mounting external media, not to mention its hard drive housekeeping and apparently notifications about bad SMART, if it really does that.

lockie commented 1 year ago

I'd rather mount USB sticks manually than have my 4TB storage disk thrashed, thank you very much.

tue-kyndal commented 1 year ago

I have the same problem. I cannot spin down my disks using the hdparm service unless I use timeouts of less than 10 minutes. It's most likely because of the problem with udisks2 described here. How do you configure udisks2 to stop checking SMART data so frequently that it defeats other very important disk management functions, like spinning the disk down after a sensible interval such as 30 minutes? I struggled with hdparm and the systemd services that load it for weeks before I read about the issue with udisks2, and I'm stunned to read this post. It's been going on for years now, and it's still a huge issue.

Will the programmers behind udisks2 please take this seriously!

It's a major issue, and very, very annoying and frustrating!

petermolnar commented 1 year ago

One more here. This is one of the most annoying behaviours I've seen so far. There's a thread at https://bugs.launchpad.net/ubuntu/+source/udisks2/+bug/1373318 which mentions https://launchpadlibrarian.net/186132339/AdjustableHousekeeping.patch as a possible solution, so I'm wondering why that patch has not been applied and included yet.

pothos commented 1 year ago

Comment from a bystander: You could open a PR with that but maybe it's not as simple as that, assuming that the last statement on this topic is still valid:

More work is needed here and this should be first thoroughly thought through.

dark-penguin already put some thought into this issue, but more of a scratch-your-own-itch effort is needed on the path forward: someone has to sit down and do the hard work…

zljubisic commented 1 year ago

I am experiencing the same problem with udisks2 and the 10-minute timeout. In my case, even if I set the spindown to 9 minutes with hdparm -S 108 /dev/sda, the disk still doesn't spin down.
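As a cross-check of the figure above: the hdparm -S argument is not a number of minutes but an encoded value. The sketch below decodes it following the table in the hdparm man page. `decode_standby_timeout` is a hypothetical helper written for this illustration.

```python
from typing import Optional


def decode_standby_timeout(value: int) -> Optional[int]:
    """Translate an hdparm -S value into seconds; None if not a fixed duration."""
    if not 0 <= value <= 255:
        raise ValueError("hdparm -S takes a value from 0 to 255")
    if value == 0:
        return 0                        # timeout disabled
    if value <= 240:
        return value * 5                # multiples of 5 seconds
    if value <= 251:
        return (value - 240) * 30 * 60  # units of 30 minutes
    if value == 252:
        return 21 * 60                  # 21 minutes
    if value == 255:
        return 21 * 60 + 15             # 21 minutes 15 seconds
    return None                         # 253 vendor-defined, 254 reserved


if __name__ == "__main__":
    seconds = decode_standby_timeout(108)
    print(f"-S 108 is {seconds} seconds ({seconds // 60} minutes)")
```

So -S 108 is indeed 540 seconds, one minute under the stock 10-minute housekeeping interval.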

The only solution so far that worked is by using this script: https://github.com/ngandrass/truenas-spindown-timer/blob/master/spindown_timer.sh started like this: spindown_timer.sh -v -t 540 -p 60 -m -i sda

I also tried to solve the problem with:

cat /etc/udev/rules.d/80-udisks.rules
KERNEL=="sd*[!0-9]", ATTR{removable}=="0", ENV{ID_BUS}=="usb", ENV{DEVTYPE}=="disk", ENV{UDISKS_DISABLE_POLLING}="1"
KERNEL=="sd*[!0-9]", ATTR{removable}=="0", ENV{ID_BUS}=="ata", ENV{DEVTYPE}=="disk", ENV{UDISKS_DISABLE_POLLING}="1"
KERNEL=="sd*[!0-9]", ATTR{removable}=="0", ENV{ID_BUS}=="scsi", ENV{DEVTYPE}=="disk", ENV{ID_VENDOR}=="ATA", ENV{UDISKS_DISABLE_POLLING}="1"

but it didn't work either.

Interestingly, this 80-udisks.rules solution does not work on Raspbian OS, but it works on OSMC. It looks like the OSMC developers have patched something correctly. My disk is a WD Green 4TB.