agl / pond

Pond
BSD 3-Clause "New" or "Revised" License
911 stars 109 forks source link

enhancement: ability to delete outbox acknowledgements #134

Open david415 opened 9 years ago

david415 commented 9 years ago

When an outgoing ack has a ratchet mismatch then it will fail to send but remain in the outbox thus blocking the entire outbox queue... preventing any other messages from being sent to any other contact.

Shouldn't we provide a way to delete outgoing ACKs (unmerged acks)?

(I had to write code to remove messages from my outbox in order to get my pond client to send messages again... but normal people would just give up I think.)

Or if an ack fails to send it should be removed from the queue if it results is a key mismatch error... so that it doesn't prevent other outgoing messages from being sent.

burdges commented 9 years ago

Could you post the error messages? Are you sure it's the ratchet and not the group signature scheme? I didn't think the server knows anything about the ratchet.

I've had normal messages block the queue when a contact's state got messed up too, and the server continuously rejected the message, but afaik that's due to the signature scheme. In fact, I'd think all messages rejected by the server go through processMessageSent here : https://github.com/agl/pond/blob/master/client/network.go#L586

In theory, moveContactsMessagesToEndOfQueue addresses these issues, but I'd agree something strange does happen occasionally : https://github.com/agl/pond/blob/master/client/cli.go#L1138 https://github.com/agl/pond/blob/master/client/network.go#L975

We should probably add some sort of retry count to queuedMessage, increment it in processMessageSent, and use c.logEvent to attach an error to the contact.

Also, an unmerged ACK is a message with a zero body length, so a quick fix for the CLI, and GUI with restart, is to add a body like "ACK failed too many times" to any ACKs with too many retries. And if you must edit your pond state file then you could merely add a body to the bad ACK. ref : https://github.com/agl/pond/blob/master/client/network.go#L115 https://github.com/agl/pond/blob/master/client/cli.go#L914

david415 commented 9 years ago

unfortunately i did not keep a copy of the error message but i think it is the group sig not ratchet... i think it said: DELIVERY SIGNATURE INVALID

burdges commented 9 years ago

I'm experiencing a similar issue in messaging a collaborator right now, so I'll try to debug this.

If I attempt to message Ram then all messages remain blocked indefinitely with DELIVERY_SIGNATURE_INVALID until I remove the message to Ram and quite and restart pond.

There is some chance the problem is related to sending an ACK to an old message, but I believe the problem is that moveContactsMessagesToEndOfQueue does not work correctly.

agl commented 9 years ago

DELIVERY_SIGNATURE_INVALID means that something has gone very wrong: either your group member private key is incorrect, the contact has changed theirs somehow, or the server's group public key is corrupt.

Can the contact receive messages from anyone else? If so, then that suggests the former. Did this happen after a revocation by the contact?

leif commented 9 years ago

Apologies for neglecting to report this!

I've seen DELIVERY_SIGNATURE_INVALID happen to three of my contacts. As far as I know, it happened with all of their contacts at once after they did a revocation - they can still send messages, but nobody can send messages to them.

I believe that at least one of the three still has the affected state file and would likely be willing to help figure out what happened. Probably another issue should be opened about this.

As to the feature request in this issue, when the DELIVERY_SIGNATURE_INVALID bug is fixed it won't be necessary, but meanwhile here is a very small patch to display empty (ack) messages in the cli so that they can be deleted: https://github.com/leif/pond/commit/a34813875dae91de3501ee38e914fbc3c872e819

leif commented 9 years ago

Actually, the first person I had this happen with could still send messages to me, but the latest one cannot - I received a message from them but got the "Failed to decrypt message" error.

burdges commented 9 years ago

I've frequently gotten DELIVERY_SIGNATURE_INVALID when a contact deletes someone, or maybe it's when they add someone. It usually just goes away like GENERATION_REVOKED, actually I've at least once gotten GENERATION_REVOKED followed by DELIVERY_SIGNATURE_INVALID.

I've submitted a patch that makes moveContactsMessagesToEndOfQueue log whenever it runs : https://github.com/burdges/pond/commit/5e4110d794ebac4197b70a66c3750295208ffabb

I've reproduced the following test scenario :

I send messages to Leif, Ram Vedam, and David in that order. Lief's message gets sent fine. Ram's message repeatedly generates the following error. And David's message does not send.

 Oct 29 16:02:23: Starting message transmission to pondserver://ICYUHSAYGIXTKYKXSAHIBWEAQCTEF26WUWEPOVC764WYELCJMUPA@jb644zapje5dvgk3.onion
 Oct 29 16:02:29: Moving messages to Ram Vedam to the end of the queue.
 Oct 29 16:02:29: Error from server pondserver://ICYUHSAYGIXTKYKXSAHIBWEAQCTEF26WUWEPOVC764WYELCJMUPA@jb644zapje5dvgk3.onion: error from server: DELIVERY_SIGNATURE_INVALID
 Oct 29 16:02:29: Next network transaction in 2m43.292s seconds

It's not actually that moveContactsMessagesToEndOfQueue does not work though, because even if I abort Ram's message, so that no messages from Ram are in the queue, then David's message still doesn't send. Even if I delete the draft to Ram, David's message still doesn't send. And moveContactsMessagesToEndOfQueue keeps being run for Ram.

What finally fixes it is : Abort all messages to Ram and quit and restart pond. I still cannot send to Ram of course, but any messages to Lief and David shoot off as soon as I restart.

It's as if the head of the queue is not really being stored in the queue somehow. I suspect the thread doing the sending has a local copy of the queue somehow and does not refetch it from the client data structure. Or maybe moveContactsMessagesToEndOfQueue needs to poke some channel?

Also, I wonder if messageSentChan should be poked on DELIVERY_SIGNATURE_INVALID, like with GENERARION_REVOKED? https://github.com/agl/pond/blob/master/client/network.go#L979

burdges commented 9 years ago

I guess there is a separate potentially server side bug if DELIVERY_SIGNATURE_INVALID should never happen.

@agl Did you try to key up with Ram using the ss I sent you both? Maybe this could be related to your systems being partially keyed up while Ram is largely offline? Anyways I've pond messages you his public identity so that you can see what state the server is in.

kargig commented 9 years ago

I am also facing the same problem with a contact. I constantly get "DELIVERY_SIGNATURE_INVALID" trying to pond messages to him. If I try and send messages to others while messages to contact A are still in the queue, all others also get "DELIVERY_SIGNATURE_INVALID". If I delete all queued messages to A, restart pond and try to resend messages to other contacts it works.

burdges commented 9 years ago

I've now fixed the bug in moveContactsMessagesToEndOfQueue in https://github.com/burdges/pond/commit/1433bc3ec3707da256eb8689f7e827cc046d4ec5 so failures with one contact should not prevent another contact's messages from sending. Appears this change already now lives in pull requests https://github.com/agl/pond/pull/132 and https://github.com/agl/pond/pull/133.

I have not figured out why DELIVERY_SIGNATURE_INVALID appeared in the first place, maybe that's a server issue, not sure.

burdges commented 9 years ago

Ram can send to me, but I continue getting DELIVERY_SIGNATURE_INVALID when trying to send to him, so it's not something pond can just recover from. I cannot replicate the error in developer mode yet.

burdges commented 9 years ago

I think my fix addresses the ACK issue : https://github.com/burdges/pond/commit/1433bc3ec3707da256eb8689f7e827cc046d4ec5

We're discussing the server's DELIVERY_SIGNATURE_INVALID issue here now : https://github.com/agl/pond/issues/135

agl commented 9 years ago

To anyone who got hit by the persistent DELIVERY_SIGNATURE_INVALID on incoming messages, please see my message on #135.