LemmyNet / lemmy

🐀 A link aggregator and forum for the fediverse
https://join-lemmy.org
GNU Affero General Public License v3.0
12.96k stars 861 forks source link

[Bug]: Comment requests are not idempotent #4735

Open Nohus opened 1 month ago

Nohus commented 1 month ago

Requirements

Summary

Duplicate comment requests result in multiple copies of the comment being posted. The requests should be idempotent, so that sending the same request multiple times does not result in multiple comments. If a Lemmy client does not get a response due to a network issue, the user will be told the comment wasn't posted and will retry. If the original request did in fact succeed, this will result in a duplicate comment being posted.

Steps to Reproduce

  1. Send the same comment request multiple times.
  2. Multiple comments are posted, instead of one.

Technical Details

Reproduced in the Boost for Lemmy Android app.

Version

BE: 0.19.3

Lemmy Instance URL

No response

Nutomic commented 1 month ago

How can we tell that its the same request? The comment form doesnt have any sort of unique identifier. I suppose we could check for existing comments with identical content, post_id and parent_id, but that might give some false positives.

Nohus commented 1 month ago

I would think this would require adding a request_id/comment_id field that the client includes by sending a random value, and the backend ignores duplicate requests with the same id. The client would then change the id if it's a different comment, or keep it the same if it's retrying the same comment.

dessalines commented 1 month ago

There's already a comment.id field that's returned after the comment has been created, IE CommentResponse.comment_view.comment.id

Anyways this problem would affect not just comment, but any create operation, from posts, private messages, likes, etc. The back-end can't know about requests that failed, and shouldn't be storing extra "client-side" ids just to check for dupes.

If a Lemmy client does not get a response due to a network issue, the user will be told the comment wasn't posted and will retry. If the original request did in fact succeed, this will result in a duplicate comment being posted.

The real culprit here is how clients are incorrectly handling failed requests. Issues need to be opened up on those app's repos for their failure logic.

The only other way I could see this being fixed, is to move ID generation to the client side and use UUIDs or something, which isn't a good idea.

Nohus commented 1 month ago

here's already a comment.id field that's returned after the comment has been created

Sorry if I'm missing something, but that does not seem relevant for this issue? If I client does not receive a response, it doesn't matter what the response contains.

The real culprit here is how clients are incorrectly handling failed requests.

What would be the correct behavior in your opinion, when a client sends a request and does not receive a response? It's impossible to tell for the client if the request actually succeeded or not. I'd gladly open an issue in client projects but I don't see any incorrect behavior on their side, correct me if I'm wrong.

This is a real issue that got me banned from an instance. I posted a comment while on a train with a spotty mobile connection, the request timed out, and my client correctly told me the request timed out and offered to retry, which I did a few times. Hours later I discovered all the timed out requests actually worked and I got banned for spamming. I'm sure I'm not unique in this.

It's a common problem that's being solved in a lot of APIs, here are a few examples: https://api-docs.deliveroo.com/docs/signature-idempotent-requests https://docs.stripe.com/api/idempotent_requests https://shopify.dev/docs/api/usage/idempotent-requests https://developer.squareup.com/docs/build-basics/common-api-patterns/idempotency

The description from one of the above links sounds reasonable to me:

To perform an idempotent request, provide an Idempotency-Key: header in the request.

An idempotency key is any unique client-generated value with sufficient entropy to avoid collisions. We recommend using V4 UUIDs as your keys.

For idempotent requests, the API saves the status code and body of the response of the initial request and, when the same idempotency key is used in another request, it replays the saved response.

Anyways this problem would affect not just comment, but any create operation, from posts, private messages, likes, etc.

That's right, ideally the idempotency header would be supported by all of them.

dessalines commented 1 month ago

Enforcing uniqueness for client-side generated UUIDs could have some benefits, but adding these for every create operation would be a lot of work. I'll re-open if you or anyone else wants to work on it tho.

dullbananas commented 1 month ago

The server could prevent creating a comment with the same content, parent comment, and creator as an existing comment, unless the check is disabled using an optional api parameter

dessalines commented 1 month ago

If we had something like that, It'd be nice if it could be made as a DB constraint.

Nutomic commented 1 month ago

Reopening this because its definitely a real issue.

A better way to solve it might be through a middleware, with an HTTP header containing the request id. That would work similar to the existing session middleware which handles logins. Then it would automatically work for all requests and wouldnt need any database changes. The request ids can be stored in memory.

Nothing4You commented 1 month ago

The request ids can be stored in memory

that is not reliable when running more than one lemmy process

MV-GH commented 1 month ago

Depends on the load balancing method. (If based on IP) It can be pretty reliable. Could be worthy tradeoff (fast + simple) . All though you could easily persist this as [RID + SID+ Resource type] Where RID is request id, SID session id+ Resource Type (Post/Comment/Dm). You can auto delete entries that are older than 1 hour as they don't matter anymore.