Server-generated ids break idempotence

w3c / activitypub

http://w3c.github.io/activitypub/


Server-generated ids break idempotence #336

Open rkaw92 opened 5 years ago

rkaw92 commented 5 years ago

Hi, Section 6 states:

If an Activity is submitted with a value in the id property, servers MUST ignore this and generate a new id for the Activity.

On unreliable networks such as the Internet, a desired property of systems is idempotence (often called idempotency), where retrying a given operation that has succeeded is safe, and gives the same result as if done only once. Incidentally, retrying is also the only way to recover from Web request failures.

The quoted requirement basically tells the server to ignore client-generated IDs, which, in idempotent systems, function as the locator for the previous version of the command. Using the ID, the server can establish whether the operation previously succeeded, or if there is no trace of it, in which case it must be carried out from the start.

Client-generated IDs are a requirement for idempotent command processing. Currently, I cannot see how they can be used with the protocol to guarantee at-most-once processing semantics.
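To illustrate the point, here is a minimal sketch of idempotent command processing keyed on a client-supplied id. The `IdempotentOutbox` class, its in-memory store, and the example URLs are assumptions made for illustration; nothing here is taken from the ActivityPub specification or any existing server.

```python
class IdempotentOutbox:
    """Toy in-memory outbox that deduplicates on the client-supplied id (hypothetical)."""

    def __init__(self) -> None:
        self._results: dict[str, dict] = {}  # activity id -> stored response
        self.side_effects = 0                # counts how often work was really done

    def submit(self, activity: dict) -> dict:
        activity_id = activity["id"]
        # A known id means the earlier attempt succeeded: replay its result.
        if activity_id in self._results:
            return self._results[activity_id]
        # No trace of the id: carry the operation out from the start.
        self.side_effects += 1
        result = {"status": 201, "location": activity_id}
        self._results[activity_id] = result
        return result


outbox = IdempotentOutbox()
note = {"id": "https://example.com/activities/3f2a", "type": "Create"}
first = outbox.submit(note)
retry = outbox.submit(note)  # same id, so no second side effect
```

If the server discards the id and mints a new one on every request, the lookup in `submit` has nothing to match against, and each retry looks like a brand-new activity.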

kaniini commented 5 years ago

for what it's worth, Pleroma completely ignores that rule and is working fine.

trwnh commented 3 months ago

One option is to use the Idempotency-Key HTTP header: https://datatracker.ietf.org/doc/draft-ietf-httpapi-idempotency-key-header/

FWIW Mastodon uses this: https://docs.joinmastodon.org/methods/statuses/#create
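A rough sketch of how that could work while the server keeps assigning ids. `FakeServer` and `post_with_retry` are hypothetical names, the server is simulated in memory, and the first response is deliberately "lost" to force a retry. It is not how Mastodon or the draft defines the behaviour in detail, only the general shape.

```python
import uuid


class FakeServer:
    """Hypothetical server: assigns its own ids, remembers responses per key."""

    def __init__(self) -> None:
        self._by_key: dict[str, dict] = {}
        self._next = 1
        self.created: list[str] = []
        self.drop_next_response = True  # simulate one lost response

    def post(self, headers: dict, body: dict) -> dict:
        key = headers["Idempotency-Key"]
        if key not in self._by_key:
            # First time this key is seen: do the work and assign an id.
            assigned_id = f"https://example.com/activities/{self._next}"
            self._next += 1
            self.created.append(assigned_id)
            self._by_key[key] = {"status": 201, "id": assigned_id}
        if self.drop_next_response:
            # The work was done, but the client never hears about it.
            self.drop_next_response = False
            raise ConnectionError("response lost in transit")
        return self._by_key[key]


def post_with_retry(server: FakeServer, body: dict, attempts: int = 3) -> dict:
    # The key is generated once and reused for every attempt.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for _ in range(attempts):
        try:
            return server.post(headers, body)
        except ConnectionError:
            continue
    raise ConnectionError("all attempts failed")


server = FakeServer()
response = post_with_retry(server, {"type": "Create"})
```

The activity is created once even though the request was sent twice, and the client still learns the server-generated id from the replayed response.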

evanp commented 3 months ago

One big problem with having the client generate IDs is that there's no enforcement of URL paths in ActivityPub. So, a client may generate an ID that's got completely different routes from what the server supports. For example, AP servers might include random numbers, UUIDs, auto-increment numbers, the username, the date, the activity type, etc. in the URL. That makes it hard for the client to know what kind of ID to generate.

One possibility is supporting some mechanism to share a URI template used for generating URLs, although there are still issues with uniqueness. However, that might be a good solution.
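As a sketch of what that could look like on the client side: the template string, the variable names, and the `expand` helper below are all made up for illustration, and how a server would advertise its template is left open. Only simple `{var}` substitution is handled, not the full URI Template syntax.

```python
import re
import uuid
from urllib.parse import quote


def expand(template: str, variables: dict[str, str]) -> str:
    """Replace each {name} with its percent-encoded value (simple expansion only)."""

    def replace(match: re.Match) -> str:
        # safe="" so that "/" and other reserved characters are encoded too.
        return quote(variables[match.group(1)], safe="")

    return re.sub(r"\{(\w+)\}", replace, template)


# Hypothetical template a server might publish for activity ids.
template = "https://example.com/users/{username}/activities/{uuid}"
activity_id = expand(template, {"username": "alice", "uuid": str(uuid.uuid4())})
```

A random UUID makes collisions unlikely, but it does not guarantee uniqueness, so the server would still need to check the id before accepting it.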

I think having the client generate the IDs is a lot of work, when at least for idempotence the header @trwnh mentions might be an easier lift (and is entirely client-derived). If there are other reasons we should support client-provided IDs, we should discuss them. Note that this would be a normative change to a MUST requirement.

silverpill commented 3 months ago

FEP-ae97, which describes client-signed activities, provides the following recommendation:

Contrary to what ActivityPub specification prescribes in section 6. Client to Server Interactions, the server MUST NOT overwrite the ID of activity. Instead of assigning a new ID, the server MUST verify that provided ID has not been used before. If activity ID is an HTTP(S) URI, the server MUST check that its origin is the same as the server's origin. The server MAY put additional constraints on the structure of activity IDs if necessary.
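The two checks in that recommendation could be sketched roughly as follows. The function names, the in-memory `used_ids` set, and the default-port handling are assumptions for illustration and are not part of the FEP text.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(uri: str) -> tuple:
    """Return (scheme, host, port), filling in the default port when omitted."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    return (scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(scheme))


def accept_activity_id(activity_id: str, server_origin: str, used_ids: set) -> bool:
    # The id must not have been used before.
    if activity_id in used_ids:
        return False
    # For HTTP(S) ids, the origin must match the server's own origin.
    scheme = urlsplit(activity_id).scheme.lower()
    if scheme in DEFAULT_PORTS and origin(activity_id) != origin(server_origin):
        return False
    used_ids.add(activity_id)
    return True


used: set = set()
ok = accept_activity_id("https://example.com/activities/1", "https://example.com", used)
duplicate = accept_activity_id("https://example.com/activities/1", "https://example.com", used)
foreign = accept_activity_id("https://other.example/activities/2", "https://example.com", used)
```

Here the first call is accepted, while the repeated id and the foreign-origin id are both refused.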

URI templates are a good idea, I'll probably mention them in the FEP.