element-hq / element-x-ios

Next generation Matrix client for iOS built with SwiftUI on top of matrix-rust-sdk.
https://element.io/labs/element-x
GNU Affero General Public License v3.0

Queuing failed badly on bad connectivity #2973

Closed ara4n closed 2 months ago

ara4n commented 3 months ago

Steps to reproduce

  1. Was on the tube; zero connectivity
  2. Tried to send a message
  3. It immediately hard-failed (red circle), rather than queuing.
  4. I automatically tapped the red circle to try to force a retry (or to get it to actually queue, rather than hard-fail)... except there's no retry button; only a remove button :/
  5. Later, going back into the room when on good connectivity, the message is still there... except now it's in the queued state (hollow circle). Yet despite now being on good connectivity, it doesn't send.
  6. Later still, the message has now vanished entirely from the timeline, and looks to have never been sent at all.

Screenshot from step 4, when it started to go wrong:

image

Outcome

What did you expect?

What happened instead?

Your phone model

iPhone 12 Pro Max

Operating system version

iOS 17.5.1

Application version

631

Homeserver

matrix.org

Will you send logs?

Yes

bnjbvr commented 3 months ago

> It immediately hard-failed (red circle), rather than queuing.

Sending a message requires doing a /members request, and we try to avoid running more than one at the same time, so these requests are deduplicated. The first one failed, which caused the second (waiting) one to fail as well and be flagged as unrecoverable. There's a real bug there, and we should likely flag this specific kind of error as recoverable too.
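To illustrate the proposed fix, here is a minimal sketch of classifying a failure inherited from a deduplicated request by its underlying cause, rather than treating it as unrecoverable outright. All names here (`RequestError`, `SharedFailure`, `SendState`, `state_after_error`) are hypothetical and are not the matrix-rust-sdk API; this only models the idea.

```rust
/// Hypothetical error type for a failed request (not the SDK's real type).
#[derive(Debug, Clone, PartialEq)]
enum RequestError {
    /// Transient failure, e.g. no connectivity.
    Network,
    /// Permanent failure, e.g. the server rejected the request.
    Forbidden,
    /// A deduplicated (waiting) request failed because the request it was
    /// waiting on failed; the original cause is kept inside.
    SharedFailure(Box<RequestError>),
}

/// Hypothetical local-echo state shown in the timeline.
#[derive(Debug, PartialEq)]
enum SendState {
    /// Hollow circle: stays in the queue and will be retried.
    Queued,
    /// Red circle: will not be retried automatically.
    HardFailed,
}

impl RequestError {
    /// A shared failure is exactly as recoverable as its underlying cause.
    fn is_recoverable(&self) -> bool {
        match self {
            RequestError::Network => true,
            RequestError::Forbidden => false,
            RequestError::SharedFailure(cause) => cause.is_recoverable(),
        }
    }
}

/// Decide what the timeline should show after a send attempt fails.
fn state_after_error(err: &RequestError) -> SendState {
    if err.is_recoverable() {
        SendState::Queued
    } else {
        SendState::HardFailed
    }
}
```

Under this model, a waiter that inherits a network failure, i.e. `RequestError::SharedFailure(Box::new(RequestError::Network))`, maps to `SendState::Queued` instead of a hard failure, while a permanent cause still maps to `SendState::HardFailed`.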

That's the one new bug I found from looking at this rageshake; the rest has been fixed in the SDK, as far as I can tell:

> Later, going back into the room when on good connectivity, the message is still there... except now it's in queued state (hollow-circle). Except despite now being on good connectivity, it doesn't send.

Here, it's the timeline state that's incorrect; the message should have stayed in the hard-fail state. This bug has been fixed in the SDK, but at this point the app was using an old version of the SDK that didn't include the patch. (SDK fix)

> The app should never ever spontaneously discard unsent messages.

The app was using the SDK at commit e89659b6, which didn't have on-disk persistence of the send queue yet, so an app restart would lose unsent messages. (SDK patch)
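As a rough illustration of why on-disk persistence matters here, the sketch below saves pending messages to a file and reloads them, which is what lets unsent messages survive a restart. This is not how the SDK stores its queue; `PendingMessage`, `save_queue`, `load_queue`, and the tab-separated file format are all assumptions made for the example.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical unsent message: a transaction id plus a text body.
#[derive(Debug, Clone, PartialEq)]
struct PendingMessage {
    txn_id: String,
    body: String,
}

/// Write the whole queue to disk, one `txn_id<TAB>body` pair per line.
/// Assumes ids and bodies contain no tabs or newlines; a real store would
/// use a proper serialization format or a database.
fn save_queue(path: &Path, queue: &[PendingMessage]) -> io::Result<()> {
    let mut out = String::new();
    for msg in queue {
        out.push_str(&msg.txn_id);
        out.push('\t');
        out.push_str(&msg.body);
        out.push('\n');
    }
    fs::write(path, out)
}

/// Reload the queue, e.g. after an app restart.
/// A missing file is treated as an empty queue.
fn load_queue(path: &Path) -> io::Result<Vec<PendingMessage>> {
    let contents = match fs::read_to_string(path) {
        Ok(contents) => contents,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(Vec::new()),
        Err(e) => return Err(e),
    };
    Ok(contents
        .lines()
        // Lines without a tab separator are skipped as malformed.
        .filter_map(|line| line.split_once('\t'))
        .map(|(txn_id, body)| PendingMessage {
            txn_id: txn_id.to_string(),
            body: body.to_string(),
        })
        .collect())
}
```

With only an in-memory queue, the equivalent of `load_queue` after a restart always returns nothing, which matches the vanished message in step 6 of the report.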

bnjbvr commented 3 months ago

https://github.com/matrix-org/matrix-rust-sdk/pull/3619 should fix it.

pixlwave commented 2 months ago

Closing this as it should be included in Nightly by now and we've been testing queuing this morning.