Open sleeksorrow opened 6 years ago
addition:
XMPP Console query gives:
<iq type='get' to='tester@myhost' id='test123'>
<pubsub xmlns='http://jabber.org/protocol/pubsub'>
<items node='eu.siacs.conversations.axolotl.devicelist'/>
</pubsub>
</iq>
<iq id='test123' type='error' to='daniel@myhost/work' from='tester@myhost'>
<error type='cancel'>
<item-not-found xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
</error>
</iq>
I hope this helps pin down the cause.
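For anyone who wants to repeat this check outside the XMPP console, here is a minimal sketch in Python (standard library only) that builds the same devicelist query and classifies the reply. The function names are hypothetical and not part of lurch or converse.js; the sketch only handles XML text and does not open an XMPP connection.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"
STANZAS_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"
DEVICELIST_NODE = "eu.siacs.conversations.axolotl.devicelist"


def build_devicelist_query(to_jid, query_id):
    """Build the <iq type='get'> stanza that asks for a contact's devicelist node."""
    iq = ET.Element("iq", {"type": "get", "to": to_jid, "id": query_id})
    pubsub = ET.SubElement(iq, f"{{{PUBSUB_NS}}}pubsub")
    ET.SubElement(pubsub, f"{{{PUBSUB_NS}}}items", {"node": DEVICELIST_NODE})
    return ET.tostring(iq, encoding="unicode")


def classify_response(xml_text):
    """Return 'result', 'item-not-found', or 'other-error' for an <iq> reply."""
    iq = ET.fromstring(xml_text)
    if iq.get("type") == "result":
        return "result"
    # Match <error> by local name so the check works with or without
    # a default stream namespace on the stanza.
    error = next((c for c in iq if c.tag.rsplit("}", 1)[-1] == "error"), None)
    if error is not None and error.find(f"{{{STANZAS_NS}}}item-not-found") is not None:
        return "item-not-found"
    return "other-error"
```

Feeding the error stanza quoted above into `classify_response` yields `"item-not-found"`, which is the case where the server reports no devicelist node at all, as opposed to a node that exists but lists no devices.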
Well, XMPP is very finicky. If the devicelist is empty, that's correct behaviour. I assume the other clients keep the devicelist around for some time so they don't run into this. However, IIRC the XEP states that clients should check the devicelist and at least add themselves, and lurch does that at startup, so I'm not sure how there can be no devicelist. Is there anything suspicious going on in the server logs?
I see. So your theory is that the converse.js client fails to publish the devicelist, and the other, working client running Conversations has the devicelist cached and is using that. This would mean converse.js published the devicelist once and then stopped at some later point. That would be strange, but since we are hunting a bug, it sounds plausible.
I tried to confirm that. converse.js saves its key in the local storage of the browser, so as soon as I wipe this store, it has to create a new key with a new device id. I did that, and the Conversations client got the new key without problems and was able to send encrypted messages. Lurch did not get the new key and refused to send.
So sadly it looks like your theory is refuted. It was a good thought nevertheless.
I checked the prosody error logs and did not see any suspicious entries.
So your theory is that the converse.js client fails to publish the devicelist, and the other, working client running Conversations has the devicelist cached and is using that.
No, my theory is that the XMPP server is handling PEP in a weird way and all the other clients that are not lurch work around it by not instantly deleting clients that are not contained in the devicelist. You confirmed this by manually sending a query to the XMPP server, and there was no devicelist. So how is lurch supposed to know the devices if the list is empty?
Maybe I should ask the other devs about how they handle this.
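To make the idea of "not instantly deleting" devices concrete, here is a rough Python sketch. This is an assumption about how a tolerant client could behave, not how Conversations, converse.js or lurch actually work; all names are hypothetical. A missing node (`published is None`, i.e. the `item-not-found` case) leaves the cache alone, while a published list moves absent devices to an inactive set instead of dropping them.

```python
def apply_devicelist_update(active, inactive, published):
    """Merge a devicelist result into a local cache of device ids.

    active, inactive: sets of device ids currently known for a contact.
    published: iterable of device ids from the server, or None when the
               server answered item-not-found (no devicelist node).
    Returns a (new_active, new_inactive) tuple of sets.
    """
    if published is None:
        # No devicelist node on the server: keep what we already know.
        return set(active), set(inactive)

    published = set(published)
    known = set(active) | set(inactive)
    # Devices that disappeared from the list are kept, but marked inactive.
    return published, known - published
```

For example, `apply_devicelist_update({1, 2}, set(), [2, 3])` gives `({2, 3}, {1})`: device 1 is no longer used for sending but is not forgotten.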
converse.js saves its key in the local storage of the browser, so as soon as I wipe this store, it has to create a new key with a new device id. I did that, and the Conversations client got the new key without problems and was able to send encrypted messages. Lurch did not get the new key and refused to send.
Can you open the Pidgin debug window while doing that and tell me if any PEP updates arrive? Can you do a manual PEP query for the devicelist again once you have confirmed it works in Conversations?
I wanted the test to be as clean as possible. So I:
The result is that I can no longer reproduce the problem. OMEMO messages go back and forth successfully, and the PEP query succeeds and shows the device list.
I now get different results when repeating the above scenario. Lurch has the correct fingerprint of the converse.js client, but messages sent to converse.js just don't show up, and there is no visible error message.
Additionally, when I use converse.js in a normal browser window without private mode, the fingerprint of this session is not available on the lurch side. In this window I also cannot open the converse.js user's own profile, where the own fingerprint is listed. I need to check for conflicting browser plugins or similar...
It will take some time to get a clue. Sorry, and thank you for your help so far...
Okay, at the moment it looks like lurch gets the key from the converse.js client if I log on and off often enough on both sides. Then the error message about the empty device list is gone.
But then if I send a message from pidgin/lurch, it just does not show up on converse.js. At the same time, a message from Conversations does show up on converse.js.
Since this looks like a different problem, feel free to close this one and advise me whether I should open a new issue on the converse.js side or here.
Regarding the device list I found out: If there is a conversation to a new converse.js client, the device list only fills if
So I think perhaps lurch needs more triggers for when to ask for the device list. Maybe try fetching device list:
Sorry, I lost track of this.
(maybe) whenever a contact comes online (might be dangerous?)
This is actually how PEP works. There should be an update from your contact when your contact logs on, and this works consistently in my libpurple-only setup with Prosody. It should definitely not be related to sending messages, unless converse.js is doing something weird (I think).
So I think perhaps lurch needs more triggers for when to ask for the device list.
In theory, no, since I subscribe to PEP updates and so should always be up-to-date. In practice, though, it seems that XMPP stuff is too finicky for such assumptions and you're totally right.
That's why I asked for the raw XMPP of the PEP update, since for the devicelist to be empty after it was previously filled, there should have been an update that deletes everything.
I'm definitely going to think about how to do this better.
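One way to picture the "more triggers" idea is the following Python sketch. It is only an illustration under assumptions, not lurch's implementation (lurch is a libpurple plugin and works differently): PEP pushes remain the primary source, and an explicit fetch is used as a fallback when a contact comes online or a message is about to be sent while the cached list is empty. `DevicelistTracker` and the injected `fetch` callable are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Set


class DevicelistTracker:
    """Caches devicelists from PEP pushes and falls back to explicit fetches."""

    def __init__(self, fetch: Callable[[str], Optional[List[int]]]):
        # fetch(jid) stands in for a manual PEP query; it returns a list of
        # device ids, or None if the server has no devicelist node.
        self._fetch = fetch
        self._devices: Dict[str, Set[int]] = {}
        self.fetch_count = 0

    def on_pep_update(self, jid: str, device_ids: List[int]) -> None:
        """Primary path: a PEP notification replaces the cached list."""
        self._devices[jid] = set(device_ids)

    def _refresh_if_empty(self, jid: str) -> None:
        if self._devices.get(jid):
            return  # cache already populated, no extra query needed
        self.fetch_count += 1
        result = self._fetch(jid)
        if result is not None:
            self._devices[jid] = set(result)

    def on_contact_online(self, jid: str) -> None:
        """Fallback trigger: the contact's presence changed to online."""
        self._refresh_if_empty(jid)

    def devices_for_send(self, jid: str) -> Set[int]:
        """Fallback trigger: last-chance query right before encrypting."""
        self._refresh_if_empty(jid)
        return set(self._devices.get(jid, set()))
```

Only querying when the cache is empty keeps the extra traffic small; whether a fetch on every presence change would be "dangerous" in the sense raised above is left open here.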
I'm new to OMEMO and as of now I'm just testing on my own server with my own user accounts. A few weeks ago, converse.js (a web-based XMPP client) added OMEMO support using libsignal-protocol-javascript, and I wanted to try that.
While I can send OMEMO messages from converse.js to pidgin/lurch, the other way does not work. Pidgin/lurch cannot send OMEMO messages to the converse.js client, because "the recipients devicelist is empty". I can see pidgin/lurch (daniel@myhost) asking for the devicelist of the other side on converse.js (tester@myhost), but it does not get any.
Sending to this client gives the error message in the IM window:
(16:34:48) Even though an encrypted session exists, the recipient's devicelist is empty.The user probably uninstalled OMEMO, so you can add this conversation to the blacklist.
The message is not sent at all.
The pidgin debug log shows:
Now I don't know if this is the fault of converse.js not publishing the devicelist, of pidgin/lurch making mistakes when fetching the devicelist, or of the server (prosody-0.10 with the PEP module activated).
The confusing thing is that the connection from Android/Conversations to the same converse.js user works fine with OMEMO, which would rule out prosody and converse.js. That's why I'm opening this issue here.