mdn / content

The content behind MDN Web Docs
https://developer.mozilla.org
Other
9.16k stars 22.46k forks source link

Does error 422 indicate non-repeatability? #25239

Closed GrantGryczan closed 2 months ago

GrantGryczan commented 1 year ago

MDN URL

https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422

What specific section or headline is this issue about?

422 Unprocessable Content

What information was incorrect, unhelpful, or incomplete?

Warning: The client should not repeat this request without modification.

What did you expect to see?

I expected this warning not to be there, but I could be misunderstanding something.

Do you have any supporting links, references, or citations?

RFC 9110 is the relevant specification, and I can't find any information there implying this warning is true.

It says this about error 422:

The 422 (Unprocessable Content) status code indicates that the server understands the content type of the request content (hence a 415 (Unsupported Media Type) status code is inappropriate), and the syntax of the request content is correct, but it was unable to process the contained instructions.

"unable to process the contained instructions" doesn't imply to me that it would never be able to process the contained instructions. This seems to me like it could depend on the current state of the server. So why does MDN say the request should not be repeated?

Do you have anything more you want to share?

No response

MDN metadata

Page report details * Folder: `en-us/web/http/status/422` * MDN URL: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422 * GitHub URL: https://github.com/mdn/content/blob/main/files/en-us/web/http/status/422/index.md * Last commit: https://github.com/mdn/content/commit/45d4569a5f7c2a4e614b8e950b9e56fa890b9fed * Document last modified: 2023-03-03T04:42:31.000Z
hamishwillee commented 1 year ago

I think it is correct. The name of the error is Unprocessable Content, which indicates the content is somehow invalid. Further, the same spec provides an example that matches this implication:

For example, this status code can be sent if an XML request content contains well-formed (i.e., syntactically correct), but semantically erroneous XML instructions.

i.e.. the server can't process the content because it is "wrong" in some way.

So I think the warning is appropriate.

Does that make sense?

GrantGryczan commented 1 year ago

@hamishwillee I think that's debatable and wouldn't present that in MDN as a certain truth. I think one argument that can be made against this is that "semantically erroneous" doesn't exclude the possibility of the error's semantics depending on the current state of the server.

For example, error 422 is commonly used when there's a conflict with a request to create a resource whose name is already taken. But if the existing resource were to change their name, what would be the problem with resending the request? This is a matter of semantics; it is semantically erroneous to say "Create resource with ID X" when there's already a resource with an ID of X.

I personally would argue error 409 is better for purposes where your request conflicts with the state of the target resource since that is a more specific error which would also cover those use cases, but the fact is that I have seen many use 422 that way, and I wouldn't consider that use unreasonable, only more general. In fact, the very site we're on is one such example: creating a resource (e.g. repository) with a name that's taken on GitHub returns status 422. (Auxiliary note: In case it helps support that this use of error 422 is popular, after writing this paragraph I asked ChatGPT what some use cases for error 422 might be, and 2 of 4 of them were explicitly dependent on server state, 1 of those 2 being name conflicts.)

My point is just that I think it's debatable, as I very often see error 422 used in a way that leaves the request repeatable. And thus it shouldn't necessarily be presented by MDN in a way that can mislead developers (as it initially misled me) into thinking specifications explicitly discourage requests that received error 422 from being repeated. This warning could be especially confusing for people consuming APIs using error 422, where the error does depend on the state of the server and the request actually can be repeated.

hamishwillee commented 1 year ago

@GrantGryczan Thanks for expanding your answer/clarifying.

The fact though is that I am not an expert and I could be completely wrong.

@teoli2003 Can you advise? I think we should definitely provide examples of where this should be used as done by the spec. I'm not sure where the line is drawn for this being affected by server state "you might do something on the server that subsequently allows this to succeed"

teoli2003 commented 1 year ago

My understanding is the same as @hamishwillee. You need to modify the state of the server's content before repeating this request. The phrase "without modification" is technically correct as it covers both a modification of the request and a modification of the server content but is somewhat ambiguous.

What about:

Warning: The client should not repeat the same request without modifying the server's content, as it will lead to the same error; it is not a transient problem.

GrantGryczan commented 1 year ago

Apologies for this post being so long. I wrote every word for a reason, so if your goal is to understand my arguments then I would appreciate it if you could thoroughly read it. But if not, don't worry about it--there are many things more important than this issue.

@hamishwillee:

  • I think using 422 for the name clash would be a misuse of the response because the description and example in the spec are clearly about the content, and I don't think the name/destination itself is "content" as such. So for that case 409 makes a lot more sense.
  • There is a strong implication in the wording and the example that this is not about server-state.

Where is there such an implication in the wording of the definition itself? There are only three conditions the specification provides for this status code in its definition:

Which part of that definition do you think discourages the error from depending on server state or from depending on things other than content? It can understand the request but still be unable to process its instructions for reasons that don't depend on the request's content.

Or maybe you meant strictly the example implies it's not about server state, in which case: the specification never implies that the examples it gives for status codes are the only ways each status code should be used. I thought the same of error 409 as well previously; its second paragraph has a similar form:

For example, if versioning were being used and the representation being PUT included changes to a resource that conflict with those made by an earlier (third-party) request, the origin server might use a 409 response to indicate that it can't complete the request. In this case, the response representation would likely contain information useful for merging the differences based on the revision history.

Because of this, I used to think error 409 is primarily intended for version control. But no, that's just an example. All that's needed for error 409 is a conflict with the state of the resource on the server (hence it being appropriate for name conflicts).

  • Even if 422 is "valid" for a name clash I'm not sure the warning is really wrong in intent "The client should not repeat this request without modification.". We're telling people you need to change something before this will succeed. This isn't a temporary issue with the server. Yes the server might rename the topic so your request can succeed, but it is more likely that you need to change your request.

What's likely depends heavily on the use case; that's just one example. But even so, this warning doesn't imply it's unlikely; it implies it's discouraged. And even if it were necessarily unlikely, it is not necessarily even possible for the client to modify the server's state in a way that would prevent the error. It may be another client's responsibility for example (e.g. a name conflict with a resource the client doesn't control), or it may depend on other transient factors that no client controls. All the definition says is that it's unable to process the request. It gives no assertion about temporary or permanent. And if it can depend on anything, it can certainly depend on time--plenty of things depend on time.

  • There are alternatives for indicating state issues in a server, but none for indicating content issues of this nature.

I could totally give an example of a more complex use case I personally ran into that seemingly has no alternatives to error 422, depends on server state, and is time-sensitive, if you want. I won't provide it unless asked because this comment is already long enough.

  • ChatGPT is not a source of truth for MDN: garbage in = garbage out
  • Github are also capable of doing the wrong thing.

I used those examples to justify that it was popular and thus debatable, not that it was correct. For ChatGPT to use it for 2 of its 4 examples most likely requires that it saw those examples quite often in its training data. But I myself have seen those examples quite often as well (GitHub being one), so if you don't agree with that argument, I could present other examples instead. But my point is merely that this use of error 422 where the status depends on server state is common enough that the statement made on MDN is debatable, and the specs don't seem to specifically exclude the possibility of the request being allowed to be repeated.

I want the takeaway to be that, because I'm able to make these arguments (and I don't think they're very unreasonable--I expect many agree with them even if you and many others don't), the warning is debatable and thus could be misleading or confusing in cases where it's not true in practice due to someone disagreeing with it and using the error code differently, and that's extremely common in this case.

@teoli2003:

@hamishwillee's belief is that this error should not depend on server state, contrary to what you seem to be saying if I understand you correctly. Furthermore, even if you believe this error should depend on server state, that phrasing would definitely not paint the right picture, because it implies the client is somehow expected to modify the server's content, which is not necessarily even possible. Server state is not always something the client receiving this error can control. And if it does depend on a relevantly mutable server state, inherently it is a transient problem.

Supposing all of my arguments are wrong--this error actually isn't transient and doesn't depend on server state, and the warning cannot possibly be confusing or misleading in practice--I think the current phrasing is fine already (but I still would argue that's not the case).

hamishwillee commented 1 year ago

Which part of that definition do you think discourages the error from depending on server state or from depending on things other than content? It can understand the request but still be unable to process its instructions for reasons that don't depend on the request's content.

You're right that the last point "it was unable to process the contained instructions" could imply a server related transient issue. The bit that makes me think that this is not the intent is the example given:

XML request content contains well-formed (i.e., syntactically correct), but semantically erroneous XML instructions.

Spec authors are often pretty good at indicating when they want an error to apply to server state.

I don't want to guess here. @teoli2003 How do/can we find out the intent of the authors? A spec update would be appreciated to either allow or reject this possibility.

GrantGryczan commented 1 year ago

Yeah, but like I said in my last reply, that is just an example, not excluding other possibilities. And even if it did, "semantically erroneous" (as I argued in my first reply) doesn't necessarily exclude the erroneousness of the message from depending on time or server state.

But I do understand where you're coming from, and I do think this topic is debatable, which again is my basis for thinking the warning shouldn't be there, since I don't think it's clear-cut. But if you are able to definitively resolve the debate by prompting an update on the spec, that would be even better.

Also by the way, if you are able to get into contact with spec authors, and they do confirm it shouldn't be dependent on time or server state, I would have some questions of my own about which status code should be used in certain scenarios if not 422, where a request is unable to be processed for reasons depending on time and server state but doesn't seem to me to fall under any of the other status codes.

GrantGryczan commented 1 year ago

Oh, I will also note that error 422 is originally from the WebDAV spec, RFC 4918, where it appears RFC 9110 took their phrasing from. So you may or may want clarification from that spec's authors instead or in addition.

bsmth commented 4 months ago

Hi all, chiming in on this one. This issue highlights the note about requests that they shouldn't be repeated without modification. If you consider that a client may make a well-formed request with no syntactic errors, for illustration:

PUT /post/1/comments HTTP/1.1
{
  "date": "2034-01-01",
  "author": "bsmth",
  "comment_body": "..."
}

Assuming the endpoint is correct, the body is well-formed, the date, author and post_body fields and values are in the correct syntax and format, the server may return 422 Unprocessable Content if a date in the future doesn't make logical sense (in this case for a comment 10 years in the future).

If someone has decided (in 2024) that 422 responses make sense in their use case, I tend to side with keeping the note, as it indicates there's a business logic problem with the client request that other client codes don't describe well enough. This extra context to why the client error occurs is important.

Also, I see the WebDAV spec hasn't seen updates since 2007/2008 so I wouldn't hold my breath on an update there. Regardless, the semantics have propagated past only in WebDAV contexts.

Other examples:

I think the page could do with more information, and possibly examples, but I would keep the note, personally. What do you think?

GrantGryczan commented 4 months ago

I'm a little confused, because you gave an example where the client could repeat the request without modification and get a different result, since error 422 in that case is a matter of the current time (whether the time specified is in the future or not). In fact the difference could only be a matter of seconds rather than years, in which case doing a few retries on the request is very likely to allow it to succeed. So why do you suggest keeping the note which warns against repeating the request, given that's the example you're providing?

bsmth commented 4 months ago

the difference could only be a matter of seconds rather than years, in which case doing a few retries on the request is very likely to allow it to succeed.

This is implementation-based and therefore not guaranteed. The important part is that sending 422 responses is intended to convey to the client that 'the client is trying to do something logically wrong'. Have a look at the other examples linked to see other real-world usage.

This thread mentioned GitHub's API earlier, and this is another validation error use case on rejecting requests to update a pull request branch:

If the expected SHA does not match the pull request's HEAD, you will receive a 422

In this case, repeating the request will not succeed until the HEADs SHA matches what the client is sending. We shouldn't advise that developers retry until that happens. We should be advising that developers check the responses in such cases to see why there are client errors.

GrantGryczan commented 4 months ago

This is implementation-based and therefore not guaranteed.

Yes, which is why I'm suggesting to remove the warning that makes a general recommendation against retrying. It's not guaranteed one way or the other--retrying could very well succeed, and I can think of real-world spec-compliant use cases where it does. So a warning shouldn't imply it isn't implementation-based and that error 422 is necessarily non-repeatable.

We shouldn't advise that developers retry until that happens.

I'm not suggesting there be a warning that says to attempt retrying. I'm suggesting removing the warning that implies this error code shouldn't be used for responses that can succeed after a retry.

I, as someone writing a web server, would see this warning and incorrectly think that I'm doing something wrong by making an endpoint that returns status 422 in a situation where retrying can result in success. I did think that when I read this warning, but because I wanted to verify it, I read the specs and found no evidence for the warning (i.e. my retriable endpoints that use error 422 are fine), which led me to create this issue so others aren't misled like I was.