ChatSecure / ChatSecure-iOS

ChatSecure is a free and open source encrypted chat client for iOS that supports OTR and OMEMO encryption over XMPP.
https://chatsecure.org

OMEMO key-trust issue? similar or equal to #1046 and or #1067 #1106

Open clindner85 opened 5 years ago

clindner85 commented 5 years ago

Hi

I'd like to bring an issue with OMEMO to your attention. To me, it looks like it is a time-based issue with the OMEMO key-trust. To confirm that this is an issue with the OMEMO implementation, I cross-checked the behaviour described below against a non-encrypted MUC, where everything worked like a charm. Sorry for the long issue description, but I could not find a less complex way to identify or explain it.

Description: After an unknown timeframe of inactivity (which can be hours to days), chatsecure/zom/monal users do not receive messages in OMEMO secured MUC groups anymore (not talking about push).

I have been seeing this issue presumably since the last ChatSecure update. I say "presumably" because I did not notice it before the last update, but that does not necessarily mean the issue wasn't there before.

The issue described below has been tested back and forth for several weeks to try to identify a root cause. So I'm finally showing up here.

To ease the below description and keep it shortest; Conversations users do not face an issue at any time!

I tested on several test setups, each with Push (XEP-0357) active. Setups:

1) Friend's Prosody only, with 2 Conversations users and one ChatSecure user
2) Friend's Prosody with 2 Conversations users --> s2s to my separate Prosody, which hosts the ChatSecure user
3) Friend's Prosody with 2 Conversations users --> s2s to my separate Openfire, which hosts the ChatSecure user
4) Friend's Prosody with 2 Conversations users --> s2s to my separate ejabberd, which hosts the ChatSecure user
5) My Prosody only, hosting all users
6) My Openfire only, hosting all users
7) My ejabberd only, hosting all users

The result is always the same: users on Apple iOS with ChatSecure, ZOM or Monal experience the same issue on all of the above setups, while Conversations users do not.

At some point, the user on the Apple iOS platform stops receiving messages. ZOM recognizes that something is incoming but can't decrypt it. (I mention this because it is the only hint from the ZOM debug log; as far as I know, ZOM is a derivative of ChatSecure, yet I'm seeing this on all OMEMO-enabled iOS apps.)

Sometimes only one user's messages are not received while the other users' messages are. When I check the key trust age of the MUC users, the older the trust, the less likely it is that the Apple device receives the messages in the OMEMO-secured MUC.

I even tested with additional users, where e.g. userA was on ChatSecure, userB on ZOM and userC on Monal. The issue is the same: at some point, one, two or all of them stop receiving OMEMO-encrypted messages. Conversations is still fine. Even between Apple users (with ChatSecure, ZOM or Monal on the same device) in the above setups, one receives messages and another doesn't.

The only hint I have so far: when it happens on ZOM, ZOM shows the notification, but there is no message to read. If more messages come in, the issue repeats. Whenever this happens, I see a "No Session" message like the following in my ZOM debug logs:

2019/04/05 17:51:10:728 Error decrypting OMEMO message for ralf@example.com: Error Domain=org.whispersystems.SignalProtocol Code=10 "No Session" UserInfo={NSLocalizedDescription=No Session} I sent you an OMEMO encrypted message but your client doesn’t seem to support that. Find more information on https://conversations.im/omemo

MwohBcTl+XdsjfGs+1mpGTmRlDjggmbMidT2HDU5MuSEPGBtEAoYBiIwx3wyWGF1OlNXCwrkIUB0pvyYDDcEkLDCZxMeoR2R0K/Y1yOaVkS7BevSmbFspo2gDo3Eech8DC8=MwohBXK0Guo5GR0ctEfWhhRTBPvu8JB7k/rUjvC7B/ymAFY/EAMYACIw6qvihr0gByk+QNpFk9/Pe1+wQcCq3118aw2giYM+UtoaCO+38ZBuf4r9PUuLnVRk0xX5ellF5dY=MwohBfQeSpjNHQCwmJ3ABBzCBhj73dAE3lCsW10v15vVGM89EBUYACIwrYD4ulGD+F1xY+b38USFL9sn9FOSPY1WimiNeyrJcMQjUkN3hJinYLwmfNRy622bN+E6Pexxv+w=MwohBbcqT3Z7nwlRgUJoIhD2WULDkisdgtYL3y7SR71lOEgfEDAYACIwde5dKsE+pZah/QX3p6oXqc7l4m/Zo/Zfl2JI34ODdfYQRY14+7gH3/4Jfm7kmeYoTzeZpqq2qhQ=MwohBdY1AlG7UvVcvFIaNg9zHAHqUUoXOTngT+334KDleF4UEBoYASIw+sKswMXs3HnSD4hngdiSnfwVvM+7YjFof1T+ImpIM+0MARjTh9I1InhE6B2+INg97xGmxbKTDx0=MwohBQw3ysmyx/F6IpVjDvLZE6fSmNU/dJUqo5ndCIMmuVM+ED4YACIwyvpalQXdHAOrTg3dpwdoj8keSHdWRt/XxBCIMCbK3n5/EDgvzWe7pn4Oq2pxjFz0nym2Bqurabs=MwohBXH1mA4l5Xx3o2Ji38AFnDYSJHaY/ZXKnpi6vGhMCadPEAAYACIw8CGDj98La+jDvNypE5yLvzp8hEwdmZchpiV1wqq/WBh9W0AZlNYURg6eZgisYPPSk8oZRhp7ado=MwohBefY5adYWb3shY7FTDwYT7AD+9IE8hrDcZtFZjzgp51JEBwYAyIwnt/oPfJe+yhRKLfwK/0f/JANjZ/XOEfBT7v8bWzJqr8dH/9oICwwaYP933X7elQTMmD76jJwqC4=MwohBaiZ5dQGWVekej7gCyU+IGwzkFCrzeCEhhsgamECxXR/EGQYACIwQsBSiy7dslhfJcc7HIo7hRXM+0AaSspVpXXWwS4l3Y//0ATy8jLVH4jrlK9rTZhnBXTNd59OyEY=MwohBRcH7Wm1GUCwlRdZ6I/wM27hZ7tRGWSVRfIFUdGK1BZLEBcYACIw1EZxetySg4/yEGeffDvp30COmh3O25OXfk2iUJeQ5zMOVyGfMEzuTcc8JSlBy5tmweMje3BKBRg=MircqhXcMPFuebcKmAYxKw==
LCX6DdbyB8YUbjysJVumAcd3zgwa3c1OsRbbymOSB02qRcautTysIz15Rog0MlHefsWLAbhkwAEKsW0j34MQ/n7ycP1fuV3CvCcRi6kWPewLCZOqBIot61N+qNItJqq5N3NscNoW

--> ralf@example.com is a conversations user!

I only have one MUC, so it's easy to find out where messages are coming in. If I open the MUC and look at each user, the keys were last trusted hours or days ago. When I send one private message to each user, the keys are re-trusted ("trusted x seconds ago") and everything is fine again for MUC messages.

I hope this is clear. Please let me know if you need clarification or further information. I'd love to see this issue solved. Thanks!

mimi89999 commented 5 years ago

Hello,

This issue has been reported to me 5 times already, but I didn't open an issue because I'm not able to tell anything about it. Thanks for opening the issue. I was trying to reproduce it for several weeks on my iPhone emulator and on my iPads, but without any success. Could you please help me to reproduce it on my test devices? Do you know what might cause it?

> To me, it looks like it is a time-based issue with the OMEMO key-trust.

To me it doesn't look like a time based issue. One message was lost only several hours after the previous message. In another case it was only 2 days after the previous message.

I also don't think it is related to MUC. In all the cases, my contacts were members of an OMEMO MUC, but in at least 2 cases the first message lost was a 1:1 message.

What's interesting is that this issue also affects Monal because they don't seem to share much of CS code.

chrisballinger commented 5 years ago

If this affects Monal too then it makes me think it is a problem in the SignalProtocol-ObjC library

clindner85 commented 5 years ago

That would also match the error I see in the debug logs.

Error Domain=org.whispersystems.SignalProtocol Code=10 "No Session" UserInfo={NSLocalizedDescription=No Session}

clindner85 commented 5 years ago

Hello, ...

@mimi89999 I don‘t know how to trigger this on purpose. It just happens. With the explanations from your reply, I also don’t think by now, that it is sort of time-based.

@chrisballinger Is there any way to make ChatSecure write debug logs? I'm happy to help solve this.


mimi89999 commented 5 years ago

I really don't like this: https://github.com/ChatSecure/ChatSecure-iOS/blob/2137a954d3a64954824f63a446edf19dd4dde12d/ChatSecure/Classes/Controllers/OTROMEMOSignalCoordinator.swift#L530-L545

Basically, if we get any error other than duplicateMessage, we immediately proceed to deleting the session. I'm not familiar with SignalProtocolObjC, but I saw that there are many different errors defined. Maybe other errors don't necessarily mean that the session is broken? Maybe we could change that to just throwing a big error and letting the user reset the session? Also, it's quite normal in this case that all future errors will be No Session, and they won't explain what really happened.
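To make the suggestion above concrete, here is a minimal sketch of a per-error handling policy that never deletes the session automatically. It is written in Python for brevity and is not ChatSecure or SignalProtocolObjC code: the `DecryptError` and `Action` names are hypothetical, loosely modelled on the error names mentioned in this thread.

```python
from enum import Enum, auto


class DecryptError(Enum):
    # Hypothetical error kinds, loosely named after errors mentioned in this thread.
    DUPLICATE_MESSAGE = auto()
    NO_SESSION = auto()
    INVALID_MESSAGE = auto()
    UNTRUSTED_IDENTITY = auto()


class Action(Enum):
    IGNORE = auto()           # drop the message silently, keep the session
    REPORT = auto()           # surface the error to the user, keep the session
    REQUEST_SESSION = auto()  # there is no session to lose; try to establish a new one


def action_for(error: DecryptError) -> Action:
    """Map a decryption error to a handling policy without deleting the session."""
    if error is DecryptError.DUPLICATE_MESSAGE:
        return Action.IGNORE
    if error is DecryptError.NO_SESSION:
        return Action.REQUEST_SESSION
    # Everything else is reported, so the original cause stays visible
    # instead of being masked by later "No Session" errors.
    return Action.REPORT
```

Whether any of these errors really leaves the session usable is exactly the open question here, so the mapping is an assumption to be checked against the library.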

chrisballinger commented 5 years ago

That was originally added to work around a bug where the session would get stuck in a broken/corrupted state, so it doesn't really fix the underlying problem of the session getting corrupted in the first place.

mimi89999 commented 5 years ago

Would it be possible to add code that would send a bug report when a broken session is identified?

bastei commented 4 years ago

I experience similar issues in 1:1 chats on the same XMPP server: After some hours of inactivity, OMEMO messages sent by Conversations are not displayed in ChatSecure anymore. The problem can only be resolved by sending an OMEMO message from ChatSecure to Conversations (until the next period of inactivity).

It seems to be reproducible for me. I have an iOS device at hand and own the XMPP server. Is there any possibility to get a version of ChatSecure that prints debug messages to the console? I also have a MacBook with macOS 10.13.6 at hand, but I'm in no way familiar with macOS... The latest Xcode seems to be unsupported on that version. If I get assistance, I'm willing to build, debug and possibly fix the bug myself.

Other observations that possibly cohere with that issue:

bastei commented 4 years ago

Turns out that publishing device ids fails with my Prosody. XMPPFramework sets access_model to open, which Prosody (0.11.2) answers with a precondition-not-met error. Unfortunately, XMPPFramework stops here. If I understand it right, it should actually do node configuration instead? So this seems to be a bug in XMPPFramework? At least I was not able to get my Prosody instance to accept access_model: open...

However, if I remove access_model from the publish-options, my device id gets published and all my observations are fixed.
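The fallback described above (configure the node when publish-options are rejected, then publish again) could look roughly like this. This is a Python sketch of the control flow only, not XMPPFramework code: `publish`, `configure_node` and `PreconditionNotMet` are hypothetical stand-ins for whatever the XMPP library provides.

```python
from typing import Callable, Dict


class PreconditionNotMet(Exception):
    """Hypothetical stand-in for the server's precondition-not-met error."""


def publish_device_list(
    publish: Callable[[Dict[str, str]], None],
    configure_node: Callable[[Dict[str, str]], None],
    options: Dict[str, str],
) -> str:
    """Publish with publish-options; on rejection, configure the node and retry once."""
    try:
        publish(options)
        return "published"
    except PreconditionNotMet:
        # The node's current configuration does not match the requested options,
        # so bring the node in line first instead of giving up.
        configure_node(options)
        publish(options)
        return "configured-then-published"
```

A caller would pass something like `{"pubsub#access_model": "open"}` as `options`. If the retry fails as well, the exception propagates, which at least gives the client something to log or show to the user.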

If I'm not wrong, processKeyData in ChatSecure fails without the published device id, because the other side did not encrypt for the current registrationId (because it was never published). I think it should at least print an error message, even better show a message or notification to the user if that happens.

chrisballinger commented 4 years ago

@bastei Thank you for the detailed bug report. This does look like an issue with XMPPFramework. The OMEMO code in XMPPFramework is from an early revision of the spec and hasn't been updated in a while. I don't know why your Prosody instance is having issues because my personal server also runs the latest version without issue.

bastei commented 4 years ago

I investigated a little more on why my Prosody behaves that way and was successful: In the PEP data of the affected users, the config for devicelist was empty. After stopping Prosody and manually filling the configs with access_model: open, everything works like it should.

I think either there are other clients which don't set access_model by default or the bad PEP data came from an (unclean) migration from Prosody 0.10. Hopefully it's the latter...

Unfortunately my 'fix' can only be done as an administrator and after really knowing what is going on. So as a result the two ideas for improvement remain:

chrisballinger commented 4 years ago

@bastei Thank you for the deep dive! We should open up new issues on XMPPFramework and ChatSecure so they can be tackled separately.

GlacierSecurityInc commented 3 years ago

We've seen this issue too, and those related (such as #1165, #1067). In tracing this out, it seems that one of the reasons sessions keep getting destroyed is older messages with broken sessions that we receive again and again due to MAM. In particular, I saw this coming from a group. Even if sessions have been recreated, when we receive the older message again it fails with an error such as "SignalErrorInvalidMessage", and the session gets deleted again.

In looking at the database, it seems that the record would be overwritten with a new session. Do we really need to delete the older broken session? Would it be possible to just re-send the bundle to initiate session creation and have any new session info overwrite the old? Having re-visited this recently, we are not currently deleting the sessions in this case so that newly created sessions don't get blown away.
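One possible middle ground between always deleting and never deleting is to ignore failures caused by messages older than the current session. The following Python sketch illustrates that idea only; it is an assumption, not what ChatSecure or our build does, and `Session`, `created_at` and `should_reset_session` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Session:
    created_at: datetime  # when the current session was established (hypothetical field)


def should_reset_session(session: Optional[Session], message_sent_at: datetime) -> bool:
    """Treat a decryption failure as session corruption only for messages newer than the session."""
    if session is None:
        return False  # nothing to delete
    # A message replayed from the archive that predates the session was encrypted
    # for an older session, so its failure says nothing about the current one.
    return message_sent_at >= session.created_at
```

This relies on a trustworthy timestamp for archived messages, which would need to be verified before using such a check.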

With OMEMO issues, I'm never 100% sure if what we are doing is ideal, but we are doing several things to handle missing/dropped messages.

If there is 1) no session (SignalError.noSession), or 2) a session exists but there was an error, or 3) there are missing keys (which seems to be the case for the else branch of "guard var aesKey = unencryptedKeyData else"), we do the following:

1) Post a message to the user letting them know there was an issue and the message was not correctly encrypted for that device. This keeps the user from getting ghost pushes. They at least see a notice that something went wrong in that conversation.

2) Fetch and publish our deviceIds. The thought behind this is that if the sender didn't correctly encrypt to our device, they might not have our updated keys, so we republish them. (This is set so it only publishes once if you open the app after a while and receive a lot of messages that can't be decrypted all at once)

This could be overkill, but in our case, we also send a message back to the original sender letting them know something went wrong and that we are trying to re-sync keys.
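The "only publishes once" behaviour from step 2 can be sketched as a small rate limiter. This is a Python illustration under assumed names (`RepublishGuard`, `should_republish`) and an assumed time window; it is not the code we ship.

```python
import time
from typing import Callable, Optional


class RepublishGuard:
    """Allow a device-list republish at most once per time window."""

    def __init__(self, min_interval_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval = min_interval_seconds
        self._clock = clock
        self._last: Optional[float] = None  # time of the last allowed republish

    def should_republish(self) -> bool:
        now = self._clock()
        if self._last is not None and now - self._last < self._min_interval:
            return False  # a republish already happened inside the window
        self._last = now
        return True
```

Each undecryptable message calls `should_republish()`, and only the first call within the window triggers a fetch and publish, so a burst of failures after opening the app results in a single republish.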

Neustradamus commented 2 years ago

@chrisballinger: Any news about this bug?