bus1 / dbus-broker

Linux D-Bus Message Broker
https://github.com/bus1/dbus-broker/wiki
Apache License 2.0

quota limits #236

Closed stevegrubb closed 3 years ago

stevegrubb commented 4 years ago

I was using my desktop system when suddenly I was logged out. After a while my system let me log back in. I found this in my logs:

Aug 1 13:08:22 x2 journal[1751]: UID 1000 exceeded its 'bytes' quota on UID 1000

That is all the information I have to go on. I wrote a script to search every binary on my system to find out which program sent it. It would be nice if this message at least gave some information about which program sent it, how much quota I have, and what kind of quota it is (what does 'bytes' mean?). And how would I increase the quota if that is needed?

And then, is tearing down a user session really the right answer? I now have a corrupted akonadi database to repair. I have no idea how I exceeded the quota, what to do about it, or which applications caused it. Is it possible to log more information? If it's related to messages, maybe it's best to block reading inbound until some outbound space clears? I'd rather see the system get slower than exit.

teg commented 4 years ago

More information is available in the journal about the failure in the structured log entries.

stevegrubb commented 4 years ago

OK. Thanks for the hint. I found this.

DBUS_BROKER_USER_CHARGE_USER=4325
DBUS_BROKER_USER_CHARGE_ACTOR=4325
DBUS_BROKER_USER_CHARGE_AMOUNT=728
DBUS_BROKER_USER_CHARGE_N_ACTORS=1
DBUS_BROKER_USER_CHARGE_SLOT=bytes
DBUS_BROKER_USER_CHARGE_REMAINING=414
DBUS_BROKER_USER_CHARGE_CONSUMED=276446818
MESSAGE=UID 4325 exceeded its 'bytes' quota on UID 4325.

So, what does this mean? How do I fix this so that it keeps working and does not log me out?

teg commented 4 years ago

The UID in question has ~250MB of pending data in the broker and has reached its quota. The quota is a hard limit, and rate limiting is not supported.

A client causing a user to exceed their quota is not supported. @dvdhrm might have more tips for how to track down the offending app.
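As a rough illustration of how the structured fields quoted above can be read, here is a minimal Python sketch. The field values are copied from this thread; the interpretation of `AMOUNT` as the size of the failing request, `REMAINING` as what was left of the quota, and `CONSUMED` as what was already charged is an assumption based on the field names and the "~250MB pending" remark, not on dbus-broker documentation. The helper names `parse_fields` and `summarize` are made up for this example.

```python
# Fields copied from the structured journal entry quoted in this thread.
ENTRY = """\
DBUS_BROKER_USER_CHARGE_USER=4325
DBUS_BROKER_USER_CHARGE_ACTOR=4325
DBUS_BROKER_USER_CHARGE_AMOUNT=728
DBUS_BROKER_USER_CHARGE_N_ACTORS=1
DBUS_BROKER_USER_CHARGE_SLOT=bytes
DBUS_BROKER_USER_CHARGE_REMAINING=414
DBUS_BROKER_USER_CHARGE_CONSUMED=276446818
"""

PREFIX = "DBUS_BROKER_USER_CHARGE_"


def parse_fields(text):
    """Turn KEY=VALUE lines into a dict, skipping lines without '='."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Summarize one quota event (field meanings are assumed, see above)."""
    amount = int(fields[PREFIX + "AMOUNT"])
    remaining = int(fields[PREFIX + "REMAINING"])
    consumed = int(fields[PREFIX + "CONSUMED"])
    return {
        "slot": fields[PREFIX + "SLOT"],
        "requested": amount,
        "remaining": remaining,
        # How far the failing request overshot what was left.
        "over_by": amount - remaining,
        # Bytes already charged, converted to MiB for readability.
        "consumed_mib": round(consumed / 2**20, 1),
    }


summary = summarize(parse_fields(ENTRY))
print(summary)
```

Under these assumptions, the failing 728-byte request asked for more than the 414 bytes still available, while roughly 264 MiB was already charged to the UID. That fits the point that the last request is usually small and the accumulated backlog is the real problem.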

stevegrubb commented 4 years ago

Does dbus-broker have any idea who the client is and who is receiving it? (I have no idea, just asking.) Because that might also be interesting to add to the logging. At the time, I was using akonadiconsole, which is part of KDE's email client, kontact. It was syncing a 5 GB IMAP folder. However, crashing the session caused akonadi to mess up its database, and restoring the database is causing problems. I filed this with KDE to see if they can resolve part of the problem.

https://bugs.kde.org/show_bug.cgi?id=424935

This is a vicious cycle where a session logout crashed a database, and the database cannot restore cleanly and loops trying to restore. Either way, a hard limit that forcibly logs out a session is not the best choice. Perhaps disconnecting a client and flushing memory is better than killing the whole session. I don't even know if you know that this is the effect of this error. But it's a catastrophic logout of the user session.

As best I can tell, there is an akonadi server that talks to an external IMAP server, and akonadiconsole, which monitors the server and reports its status. If this communication backed up, no big deal, disconnect it. But killing the session caused mysql to be unceremoniously terminated, akonadiserver to be terminated, along with all desktop apps such as libreoffice, which I was editing in, and anything else. If the communication between akonadiserver and akonadiconsole was problematic, just that should be shut down.

dvdhrm commented 4 years ago

> I was using my desktop system when suddenly I was logged out. After a while my system let me log back in. I found this in my logs:
>
> Aug 1 13:08:22 x2 journal[1751]: UID 1000 exceeded its 'bytes' quota on UID 1000
>
> That is all the information I have to go on. I wrote a script to search every binary on my system to find out which program sent it. It would be nice if this message at least gave some information about which program sent it, how much quota I have, and what kind of quota it is (what does 'bytes' mean?). And how would I increase the quota if that is needed?

As Tom already mentioned, most of the information you asked for is available in the structured log entries. You can increase the quota by modifying your dbus configuration (see man dbus-daemon). However, so far all quota issues we are aware of were triggered by actual bugs in applications, so I doubt that increasing quotas solves the issue.

> And then, is tearing down a user session really the right answer? I now have a corrupted akonadi database to repair. I have no idea how I exceeded the quota, what to do about it, or which applications caused it. Is it possible to log more information? If it's related to messages, maybe it's best to block reading inbound until some outbound space clears? I'd rather see the system get slower than exit.

I very much prefer hard failures, as they make problems much more apparent. I am not saying that this is the only way to deal with these problems, but it is the model we follow and prefer. Continuing operation with limited capabilities will make debugging a nightmare, in my opinion.

Also note that we do not shut down a session. However, it is a known side-effect when a user exceeds their quota. For security and DoS reasons, any resource consumption in the system bus is accounted based on UIDs. If one UID exceeds their quota, only connections of that UID will be affected by any consequences. This, however, means that a misbehaving client application can affect operation of the client compositor or login session running as the same user.

> Does dbus broker have any idea who the client is and who is receiving it? (I have no idea, just asking.) Because that might also be interesting to add to the logging.

The final operation that exceeds the quota is very likely not the operation at fault. That is, D-Bus clients can consume lots of bus resources over their lifetime, but only a single operation will eventually exceed the limits. We have a diagnostics call that will dump internal accounting state and thus allow better introspection on what exceeded which resources:

sudo dbus-send --system --dest=org.freedesktop.DBus --type=method_call --print-reply /org/freedesktop/DBus org.freedesktop.DBus.Debug.Stats.GetStats

This call is most helpful in combination with ps aux, so the PIDs can be attributed to processes. However, this call will not expose any useful information after a session was closed, since most resources have been released by then.
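To make the "combine it with ps" advice concrete, here is a small Python sketch. It only captures the stats dump and a PID-to-command table at roughly the same moment; it does not try to parse the GetStats reply, whose layout is not described in this thread. The helper names `parse_ps` and `snapshot` and the sample PIDs are hypothetical, and `snapshot` assumes it is run as root on a machine with a system bus, since the command above is run with sudo.

```python
import subprocess

# Command quoted from the thread (without sudo; run this script as root).
GET_STATS_CMD = [
    "dbus-send", "--system", "--dest=org.freedesktop.DBus",
    "--type=method_call", "--print-reply",
    "/org/freedesktop/DBus", "org.freedesktop.DBus.Debug.Stats.GetStats",
]

# Header-less "PID command line" listing; separate -o options avoid
# ambiguity in how ps splits comma-separated columns with '='.
PS_CMD = ["ps", "-e", "-o", "pid=", "-o", "args="]


def parse_ps(text):
    """Map PID -> command line from header-less `ps` output."""
    table = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            table[int(parts[0])] = parts[1].strip()
    return table


def snapshot():
    """Capture the raw stats dump and the PID table back to back."""
    stats = subprocess.run(GET_STATS_CMD, capture_output=True, text=True).stdout
    ps_out = subprocess.run(PS_CMD, capture_output=True, text=True).stdout
    return stats, parse_ps(ps_out)


# Offline demonstration with made-up PIDs, so the sketch runs anywhere.
sample = "  2044 /usr/bin/akonadiserver\n  2101 /usr/bin/akonadiconsole\n"
pids = parse_ps(sample)
print(pids)
```

Because the accounting state is released once the session goes away, a capture like `snapshot()` would have to run while the suspect application is still active, for example periodically during a large sync.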

> This is a vicious cycle where a session logout crashed a database, and the database cannot restore cleanly and loops trying to restore. Either way, a hard limit that forcibly logs out a session is not the best choice. Perhaps disconnecting a client and flushing memory is better than killing the whole session. I don't even know if you know that this is the effect of this error. But it's a catastrophic logout of the user session.

We are neither disconnecting a client nor forcibly closing a session. We simply refuse further resource allocation if the limits are reached. This is a DoS protection so one user cannot consume infinite shared resources and thus affect operation of a different user.

If resource accounting is not required, you can always increase max_connections_per_user in your dbus configuration to a maximum (2^64), which will have the desired effect of also increasing all other per-connection resource limits.

> As best I can tell, there is an akonadi server that talks to an external IMAP server, and akonadiconsole, which monitors the server and reports its status. If this communication backed up, no big deal, disconnect it. But killing the session caused mysql to be unceremoniously terminated, akonadiserver to be terminated, along with all desktop apps such as libreoffice, which I was editing in, and anything else. If the communication between akonadiserver and akonadiconsole was problematic, just that should be shut down.

As I said before, there is no special teardown logic involved. All the mentioned applications should be able to save their state when a dbus operation fails. However, if these applications rely on D-Bus to save their state, then a failing D-Bus operation will obviously have data-loss as a consequence.

dvdhrm commented 3 years ago

I am closing this, as this is the expected behavior. Feel free to re-open this, or open a new issue if the problems persist. Thanks!