httpwg / http-extensions

HTTP Extensions in progress
https://httpwg.org/http-extensions/
439 stars 146 forks source link

Investigate the use of Digest with PATCH #853

Closed ioggstream closed 4 years ago

ioggstream commented 5 years ago

About PATCH

1- PATCH is not in http 723x 2- PATCH media type should be something like https://tools.ietf.org/html/rfc7396 because servers SHOULD NOT assume PATCH semantics for generic media types that don't define them, such as "application/xml" or "application/json". Doing so will cause interoperability issues, because the semantics of PATCH become specific to that resource, rather than general. 3- it's difficult to use patch + checksums, like @LPardue said

As of now we

1- warn people about PATCH+Digest 2- don't forbid PATCH+Digest

Notes

reschke commented 5 years ago

it's difficult to use patch + checksums

care to elaborate?

LPardue commented 5 years ago

I can't find my original comment. IIRC I felt there was potential for confusion with what a digest header field on a PATCH request actually meant. Is is A) the digest of the PATCH body or B) the intended digest of the resource after being patched or C) something else.

If the intention is B, there is the danger that other requests (i.e. other PATCHes) could modify the representation such that the digest would not match the client's anticipation.

martinthomson commented 5 years ago

It seems fairly clear to me that any Digest header field refers to the included representation.

Now, that doesn't mean that there isn't value in signaling what you expect the digest of the updated resource representation to be after the PATCH is applied. That however, is follow-on work and probably requires a new header field (say Patched-Digest).

ioggstream commented 5 years ago

@LPardue the original comment is in the FAQ https://github.com/httpwg/http-extensions/blob/master/draft-ietf-httpbis-digest-headers.md#faq

a 200 OK response to a PATCH request would contain the digest of the patched item, and the etag of the new object. This behavior - tighly coupled to the application logic - gives the client low probability of guessing the actual outcome of this operation (eg. concurrent changes, ...)

@martinthomson the behavior inherited from RFC 3230 with Range-Requests and other partial representations is to send the digest of a complete representation. See:

iiuc sending a checksum of a partial representation re-opens the door to the issues that lead to the Content-MD5 deprecation.

reschke commented 5 years ago

We need to be clear whether we talk about request or response header fields.

LPardue commented 5 years ago

So this kind of came to a standstill. Trying to pick it up, assertions I make are:

We currently have the following text in the FAQ appendix

How to use Digest with PATCH method?

The PATCH verb brings some complexities (eg. about representation metadata header fields, patch document format, …),

    PATCH entity-headers apply to the patch document and MUST NOT be applied to the target resource, see [RFC5789], Section 2.
    servers shouldn’t assume PATCH semantics for generic media types like “application/json” but should instead use a proper content-type, eg [RFC7396]
    a 200 OK response to a PATCH request would contain the digest of the patched item, and the etag of the new object. This behavior - tighly coupled to the application logic - gives the client low probability of guessing the actual outcome of this operation (eg. concurrent changes, …)

This seems both overly discussive but light on useful information. Therefore I suggest that we scrap the current text in the appendix and possibly add some guidance under a broader "applicability of digest to request payloads" section.

ioggstream commented 5 years ago

requests

imho request Digest, disrespectful of the payload body, should reflect the reassembled resource representation, similarly to range-requests.

response

PATCH does not have a well defined semantic like PUT

A successful PUT of a given representation would suggest that a subsequent GET on that same target resource will result in an equivalent representation being sent in a 200 (OK) response

If it had, I'd suggest that PATCH should return the Digest of the selected representation of the resource independently of the payload body and response status code (eg. 204)

problems

Still, the fact that PATCH's representation metadata are tied to the PATCH and not to the resource, make it impossible for the client to select the representation

reschke commented 5 years ago

No.

A Digest needs to refer to the payload of the message. If it appears in a request, than it's for the payload of the request.

ioggstream commented 5 years ago

If it appears in a request, than it's for the payload of the request.

My understanding is that's not what RFC3230 says, see the instance definition in https://tools.ietf.org/html/rfc3230#section-3.

We aim at obsoleting RFC3230 without modifying the semantic: in that sense, I see PATCH as an "instance manipulation" of the target resource.

reschke commented 5 years ago

OK, can you translate the following text from RFC 3230:

   The instance described by a message might be fully contained in the
   message-body, partially-contained in the message-body, or not at all
   contained in the message-body.  The instance is specified by the
   Request-URI and any cache-validator contained in the message.

into something I can understand? What, for instance, refers the Digest request header field in PUT or POST to?

ioggstream commented 5 years ago

fully ... partially... not at all contained in the message body

see resource representation

See you use digest to convey the representation-data-digest of the selected representation

PUT

PUT, DELETE a representation of the status of the action;

so digest is well defined

PUT with complete representation

See this example

you send a complete representation, and retrieve a complete representation of the resource

PUT without complete representation

If PUT returns 204 it may return a Digest header, provided that the response contains all the required representation metadata.

See #926

POST

POST a representation of the status of, or results obtained from, the action;

We are not providing examples containing the result of the action and we're currently limiting to exchanges reflecting the status of the resource.

We could clarify that in the case of result of the action, Digest value applies to the representation-data.

reschke commented 5 years ago

Yes, but you were arguing for compatiblity with the old spec. To get there, I'd like to understand what it actually defines for PUT/POST/PATCH.

My theory is that it really does not define that, so we are free to define things properly.

ioggstream commented 5 years ago

The old spec does not define behaviors for PUT/POST/PATCH.

I think we should retain the fact that Digest should be independent from "instance manipulations" https://tools.ietf.org/html/rfc3230#section-1

I see PATCHing as an "instance manipulation" because we are not conveying the full representation in the request: do you think that's reasonable?

I will open:

reschke commented 5 years ago

I see PATCHing as an "instance manipulation" because we are not conveying the full representation in the request: do you think that's reasonable?

Note. We shouldn't mix up completely separate things:

The latter is at a different layer of abstraction.

As @martinthomson said, the Digest header field always should refer to the payload being transmitted (both directions).

ioggstream commented 5 years ago

payloads that are used to modify the target resource (POST, PATCH...) The latter is at a different layer of abstraction.

@reschke thousand thanks for your clarification: I tried to address them in #923.