flatpak / xdg-dbus-proxy

GNU Lesser General Public License v2.1
57 stars 21 forks source link

Revisit requirement on monotone increasing client serials #46

Closed A6GibKm closed 5 months ago

A6GibKm commented 1 year ago

https://github.com/flatpak/xdg-dbus-proxy/blob/b15c7b26d6d4dd4c947b1a2c836f598b75cbb5fe/flatpak-proxy.c#L2184-L2190 requires series to be monotonically increasing, for a client it might be impossible to ensure the serials are strictly monotonically increasing if sent from different threads.

zeenix commented 1 year ago

I agree with @A6GibKm here. This check/requirement is just wrong. The D-Bus spec says absolutely nothing about the serial number's order. They don't have to be increasing even, let alone monotonically increasing. I think the name is confusing. It's just a cookie to map method calls with their corresponding replies.

A6GibKm commented 1 year ago

By the way, when using decoder , closing and opening the camera has a certain probability to kill the portal altogether until the app is closed (with this error being displayed), so it might be breaking more apps.

alexlarsson commented 1 year ago

The proxy code relies on the serials to be monotonic, because we sometimes need to send our own messages from the proxy, and when we do this we bump last_serial and serial_offset, and then later we apply the serial_offset to new messages we get to make up for this.

For example, we may get from the client

However, this kind of rewriting breaks if the client can send reordered messages:

Now we have two message 2, which is going to be very confusing.

alexlarsson commented 1 year ago

A real fix for this would have to have a much more sophisticated tracking of which ids need to be rewritten and how much,

matthiasclasen commented 1 year ago

One idea could be to do some systematic reencoding - e.g. use odd serials for all things passed through, and reserve all even serials for our own messages

Something like:

if (message is fake) {
   message.serial = next_serial;
   next_serial += 2;
}  else {
  message.serial = message.serial * 2 + 1;
}

Would that work?

zeenix commented 1 year ago

The proxy code relies on the serials to be monotonic, because

There are many valid use cases for having a monotonically increasing serial number on the messages but just because we need them, doesn't mean we solve it by assuming a cookie to be what we need it to be. :wink:

FWIW we need ordering of messages in zbus too and we had to even create a new library for ensuring order of messages where needed. Unfortunately our solution won't work for your use case, as you need the serials encoded on the messages on the bus.

alexlarsson commented 1 year ago

@zeenix You're not wrong, but what I mean is that we can't just delete this check.

@matthiasclasen Yeah, that may work. It's equivalent to using the low bit for "fake". Alternatively we could use the high bit. Or we could start fake serials from MAXINT and count down.

I guess the main risk is that such transformations would exhaust the serial space. For example, message.serial * 2 may overflow if message.serial is large. In practice though, i think all dbus libraries start small and increment, so it should be fairly risk-free.

zeenix commented 1 year ago

@zeenix You're not wrong, but what I mean is that we can't just delete this check.

Gotcha.

I recently had quite a lot of trouble in busd code because the bus itself also has to provide the freedesktop interfaces and differentiating between own messages and others ended up with a lot of messy code. In the end, I decided to create a self-dial connection in the same process that provides those interfaces (i-e bus itself is also a client). This made things much simpler and eliminated the need for me to treat bus' own messages differently than that of other peers/clients.

I don't know if something like that would work for you but if it can, that would be a much less hacky solution than what you currently have and what you're thinking about IMO.

I guess the main risk is that such transformations would exhaust the serial space. For example, message.serial * 2 may overflow if message.serial is large. In practice though, i think all dbus libraries start small and increment, so it should be fairly risk-free.

All current libraries maybe and we can find that out to be sure but tomorrow someone can write a new library (e.g for another fancy new language) and they'll be within their right to (for example) start serial numbers from MAXUINT32 or choose some other scheme that will break proxy again.

The issue is based on your assumptions about behavior of D-Bus libraries so I'll suggest not to go for a solution that is again based on assumptions. Having said that, personally I only care about zbus here and as long as proxy doesn't break against zbus, I don't particularly care how you solve this.

matthiasclasen commented 1 year ago

All current libraries maybe and we can find that out to be sure but tomorrow someone can write a new library (e.g for another fancy new language) and they'll be within their right to (for example) start serial numbers from MAXUINT32 or choose some other scheme that will break proxy again.

It doesn't matter where you start. There's enough even and odd numbers in a 32bit integer to go around. Of course you'll need to handle overflow, but the principle stays the same

alexlarsson commented 1 year ago

I think it is likely that any dbus library implementation would use some kind of at least semi-continuos serial counter, rather than completely random. If nothing else that is the easiest way to avoid reuse. So, if we believe that a even/odd split or something like that is gonna be a problem then we can have a more precise remap tracking (say at the level of 128k serials chunks). However, in practice I think using the high-bit for internal serials is probably best. And, you don't even have to change the client message serials in that case (assuming high-bit==0 means non-fake.)

matthiasclasen commented 1 year ago

And, you don't even have to change the client message serials in that case (assuming high-bit==0 means non-fake.)

True, that is a nice point.

So instead of asserting that the serials are monotonic, we'll have to assert that the high bit isn't set.

alexlarsson commented 1 year ago

We might not even need the whole upper half of the serial space. Could just assert that its not in say to top 128k serials, and then reserve those for internal use.

rmader commented 1 year ago

Small heads up that this is now one of the last big issues blocking wider camera portal adoption (i.e. enabling it in Firefox and Chromium by default).

Edit: as in: I'd be super happy if this was solved by one of you - but will of course look into it myself if nobody else finds time/motivation :)

zeenix commented 1 year ago

Edit: as in: I'd be super happy if this was solved by one of you - but will of course look into it myself if nobody else finds time/motivation :)

I think this is the case unfortunately. Be the hero Robert. :)

zeenix commented 1 year ago

Just a heads-up, https://github.com/dbus2/zbus/pull/488/commits/2e0a8902decf9ac278803f5c7622b740b37829c4 (i-e which will be included in zbus 4.0) will probably make this issue even worse.

I'm really sorry for an inconvenience that may be caused by this change but honestly, zbus is still very much compliant with the spec here.

savely-krasovsky commented 6 months ago

Can this issue be linked to my bug? https://github.com/flatpak/flatpak/issues/5742 I am not an expert in D-Bus, but my app uses D-Bus heavily and it also activates new D-Bus clients. In my case only in ~30% of cases it works, otherwise I am getting Invalid client serial.

mcatanzaro commented 6 months ago

Almost certainly yes. Notably, you're using godbus, which I did not realize existed. So now we have two independent D-Bus implementations that both trigger this bug. That kinda ends the argument that this should be fixed in zbus rather than in xdg-dbus-proxy.

savely-krasovsky commented 6 months ago

Can someone explain how someone could workaround it? Even if it will require D-Bus library patch (zbus, godbus, etc.). From what I understood, I need to synchronize spawning new D-Bus clients so their client serials will increase monotonically.

In my case I would suggest that without patching openvpn3-linux (which I am trying to activate by calling org.freedesktop.DBus.StartServiceByName) I will almost always get this bug if openvpn3 D-Bus services are not alive.

zeenix commented 6 months ago

That kinda ends the argument that this should be fixed in zbus rather than in xdg-dbus-proxy.

That was never a good argument anyway since the bug is not in zbus at all. :) An argument could have been made that perhaps it would be easier to fix it in zbus but that's not the case either. Otherwise, I'd have "fixed" it a long time ago.

mcatanzaro commented 6 months ago

From what I understood, I need to synchronize spawning new D-Bus clients so their client serials will increase monotonically.

I suspect the serials are associated with messages, not clients. To workaround, you'd have to change how godbus computes client serials. It's probably easier to just fix the bug. Matthias proposes a scheme in https://github.com/flatpak/xdg-dbus-proxy/issues/46#issuecomment-1581271590.

savely-krasovsky commented 6 months ago

Well, godbus seems to generate serials "properly", it uses mutual lock to do it: https://github.com/godbus/dbus/blob/76236955d466b078d82dcb16b7cf1dcf40ac25df/conn.go#L834 But from what I understand, if flatpak app talks to "broken" D-Bus client (e. g. openvpn3-linux uses GDBus++) it will be also affected?

zeenix commented 6 months ago

Well, go dbus seems to generate serials "properly", it uses mutual lock to to it: https://github.com/godbus/dbus/blob/76236955d466b078d82dcb16b7cf1dcf40ac25df/conn.go#L834

This may not be enough. I think most (likely all) D-Bus libraries do that, including zbus. The issue is the sending bit: if there are multiple threads sending messages, thread B could end up sending its message with serial 2 before thread A gets to send its message with serial 1 first, even though thread A managed to get its message assigned its serial number before thread B.

savely-krasovsky commented 6 months ago

Well, I am surprised then how does this bug doesn't have a bigger impact. Any app which heavily uses D-Bus should meet this issue sooner or later.

zeenix commented 6 months ago

Well, I am surprised then how does this bug doesn't have a bigger impact. Any app which heavily uses D-Bus should meet this issue sooner or later.

Not all applications/APIs are multi-threaded. You probably won't have this issue with zbus either if you use it with a single-threaded Rust async runtime. Also, some APIs could be assigning the serial number just before sending off the message, while maintaining a lock on the socket.


In any case, I don't see much point in continuing this discussion. We know the issue and there is a proposal for a fix. It's mostly about someone implementing it now.

mardy commented 6 months ago

Just FYI, in SailfishOS there's this patch: https://github.com/sailfishos/xdg-dbus-proxy/blob/master/rpm/0003-Use-hash-table-for-mapping-reply-serials-between-con.patch

No idea why they never contributed it upstream, but it might be related.

mcatanzaro commented 6 months ago

Wow, it's unusual to see such a complex patch kept downstream. Good find.

spiiroin commented 6 months ago

No idea why they never contributed it upstream, but it might be related.

Something like: Because of reasons I needed https://github.com/sailfishos/xdg-dbus-proxy/blob/master/rpm/0002-Add-D-Bus-interface-for-querying-client-process-deta.patch - which did not work because (iirc etc) bus-serial - client-serial mapping in proxy using "serial offset between connections" was unable to handle bus to proxy method calls that are answered without involving round trip to client behind proxy (which definitely makes the serial mapping to be non-linear, but there might also be other reasons why that could happen) --> the "use hash table" -patch. It is generic, and might help with other reply-serial related issues but I'm not sure. And suitably calm "now, lets upstream things" -moment never came.

sophie-h commented 6 months ago

I'm not able to understand that patch from the code. Is the idea to remap client serials that have already been used by the proxy and are used by the client again and store that mapping? If so, that's probably the most elegant solution.

However, @matthiasclasen's suggestion to split serials into odd/even seems like the simplest and most robust solution to me. It makes it impossible to have collisions and avoids any kind of race conditions or clients starting with high serials. It also sounds pretty easy to implement.

sophie-h commented 5 months ago

Please ignore my last comment. I had some bug in my thoughts.

I have implemented the lower and higher bit solution. I'm currently running it on my system. Review would be appreciated.

zeenix commented 5 months ago

:tada:

savely-krasovsky commented 1 month ago

Can we prepare 0.1.6 release with this patch? It would be already adopted by some rolling release distributives.