element-hq / element-meta

Shared/meta documentation and project artefacts for Element clients

Unable to decrypt cause: sometimes the key backup doesn't contain the key you need #2327

Closed richvdh closed 1 month ago

richvdh commented 3 months ago

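Sometimes users log in on a new device, but despite verifying the device and successfully connecting it to key backup, necessary keys cannot be found, so messages remain undecryptable. Potential causes might include:

  • The key was never received at any other verified client. (Perhaps there were no other verified clients. Or perhaps there were other bugs.)
  • The key was received at another client, but the client was shut down/turned off before it had a chance to write the backup (see also https://github.com/element-hq/element-web/issues/27267 which exacerbates this)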

This failure mode is hard to debug (since we normally only get a rageshake from the new device which can't get the keys from backup), and hard to manage user expectations for.

richvdh commented 3 months ago

A likely realistic scenario for this:

  • User closes their laptop at 17:00 Friday
  • At 10:00 Saturday the user gets a call about an urgent issue; they log in on their mobile device
  • They are unable to decrypt any messages sent between 17:00 Friday and 10:00 Saturday

mcg-matrix commented 3 months ago

Should a "dehydrated device" be the solution for this scenario? (https://github.com/element-hq/element-meta/issues/922?)

richvdh commented 3 months ago

> Should a "dehydrated device" be the solution for this scenario? (https://github.com/element-hq/element-meta/issues/922?)

My understanding is that dehydrated devices only help if there are no logged-in sessions; in this scenario there is a session, it's just not active. @uhoreg am I right, or confused?

uhoreg commented 3 months ago

Dehydrated devices may help, depending on how we implement it.

With dehydrated devices, you always rehydrate the device when you log in, so you receive any keys that were sent to the dehydrated device. So even if you have other devices, you can use the dehydrated device to get keys that were sent before you logged in (and after the dehydrated device was created).

However, we are looking into whether we should avoid sending keys to dehydrated devices if the user has other devices available, as a way to reduce the number of to-device events they will accumulate and reduce the chances of using up all the OTKs. So if we do something like: "don't send keys to dehydrated devices if the user has any other devices", then it won't help in this scenario. If we do something like: "don't send keys to dehydrated devices unless the user has fewer than n other devices", then it may help.
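To make the two candidate policies concrete, here is a minimal sketch of the recipient selection. It is not taken from any Element or Matrix SDK: the `Device` class, the `key_recipients` function and the `min_other_devices` threshold are all hypothetical names, and the threshold rule is read as "keep sending keys to the dehydrated device while the user has fewer than n other devices".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical, simplified view of one of the user's devices."""
    device_id: str
    dehydrated: bool = False


def key_recipients(devices, min_other_devices):
    """Return the devices that should be sent a room key.

    Assumed policy: skip dehydrated devices once the user has at least
    `min_other_devices` regular (non-dehydrated) devices; otherwise send
    the key to every device, dehydrated ones included.
    """
    regular = [d for d in devices if not d.dehydrated]
    if len(regular) >= min_other_devices:
        return regular
    return list(devices)
```

In the closed-laptop scenario the user has exactly one other device. With `min_other_devices=1` ("any other devices") the dehydrated device is skipped, so it cannot help; with `min_other_devices=2` it still receives the keys and could supply them when the user logs in on mobile.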

richvdh commented 1 month ago

This issue isn't very actionable as it stands.

  • The key was never received at any other verified client. (Perhaps there were no other verified clients. Or perhaps there were other bugs.)
  • The key was received at another client, but the client was shut down/turned off before it had a chance to write the backup (see also https://github.com/element-hq/element-web/issues/27267 which exacerbates this)

I've opened https://github.com/element-hq/element-meta/issues/2421 to track these problems.

We should track these separately under their own issues.