matrix-org / matrix-js-sdk

Matrix Client-Server SDK for JavaScript
Apache License 2.0
1.55k stars 583 forks source link

Delayed membership responses in /sync cause UTDs #4291

Open kegsay opened 3 months ago

kegsay commented 3 months ago

This outlines a race condition in the CSAPI which can cause UTDs.

Consider:

In practice, clients will not encrypt for Bob, causing a UTD if you very quickly send an encrypted message after inviting a user. This can happen due to:

To fix this, we should be remembering that we, the client, have modified the membership state of the room, and invalidate the room member list (so we hit /members again). We can't assume that a 200 OK to /invite will guarantee that the user is in an invited state, so we still need to defer to the server.

A test for this is in https://github.com/matrix-org/complement-crypto/pull/98

richvdh commented 3 months ago

To fix this, we should be remembering that we, the client, have modified the membership state of the room, and invalidate the room member list (so we hit /members again).

Possibly, though the race isn't limited to this case: it's always possible for a membership change to happen in the room at the same time as we send a message, whether or not we caused that membership change.

I'd argue this is just a special-case of https://github.com/element-hq/element-meta/issues/2268 and a fix to that would fix this particular case.

(With that said, fixing this particular case is probably easier than a more general fix so may be worthwhile anyway.)

kegsay commented 3 months ago

it's always possible for a membership change to happen in the room at the same time as we send a message, whether or not we caused that membership change.

I agree, but that is fundamentally non-deterministic as senders/receivers don't synchronise in any way. For this issue, it can be and bots/scripts/apps expect it to be deterministic.