Closed biasedbit closed 9 years ago
404 seems inappropriate because there really is a resource there.
E.g. if I do a POST to /files/my and get returned a link to the resource /files/12345, then that resource does exist and you can make HEAD requests etc.; it's just that the server (probably) can't return a suitable representation if a GET request is made to that resource.
Off the top of my head, 406 - Not Acceptable could, in some situations, be suitable, but it seems a little far-fetched for the general use case (I need to read the spec some more).
405 - Method Not Allowed could be sensible if we go down the simple route of saying a resource is not GET-able until completely uploaded, but this feels a little blunt and inelegant.
409 - Conflict is my current favourite, as long as we all agree that making the request represents a client-side failure of some kind and not an issue with the server.
I see 409 Conflict as misleading in this case. To me it conveys the idea that you're trying to put something on the server that is somehow not acceptable given the current content. By the RFC it's perfectly legal to return that on a GET; it just feels wrong to me (stress on feel and me).
Going through the list of 4xx error responses, even if we consider some additions like the WebDAV 4xx codes, I can't really identify one that strikes me as being perfect for this case.
404 seems inappropriate because there really is a resource there.
Given it's not accessible, would a 403 work?
I'm just trying to go for a sensible default, hence the 404; it's probably the simplest, e.g. query the database for a file with id xkcd and complete flag set to true, and return 404 if the result is nil. Each server will always be able to implement its own behavior.
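A minimal sketch of that default, assuming a hypothetical `files` table with an `id` and a `complete` column (neither name comes from the protocol; they only mirror the example above). An in-memory SQLite database stands in for whatever store a real server would use:

```python
import sqlite3


def status_for_get(conn, file_id):
    """Return 200 if a completed upload with this id exists, otherwise 404."""
    row = conn.execute(
        "SELECT 1 FROM files WHERE id = ? AND complete = 1",
        (file_id,),
    ).fetchone()
    # A missing row means "no such file" or "upload not finished";
    # this simple default does not distinguish the two.
    return 200 if row is not None else 404


# Hypothetical data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id TEXT PRIMARY KEY, complete INTEGER NOT NULL)")
conn.executemany("INSERT INTO files VALUES (?, ?)", [("xkcd", 0), ("done", 1)])

print(status_for_get(conn, "xkcd"))     # upload still in progress
print(status_for_get(conn, "done"))     # upload finished
print(status_for_get(conn, "missing"))  # never created
```

The appeal of this approach is that a reader never needs to know whether an upload is in flight; the cost is that an unfinished file and a nonexistent one look the same.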
As you mentioned, 403 - Forbidden seems the most appropriate response here if we only allow GETs for the entire file to be successful for completed uploads. It allows for a response body where we can delegate to individual implementations how much information/reasoning they give.
Something that needs consideration is the situation where a GET request is made for a subsection of the entire file (using Range or Content-Range headers, for example), which is a very common scenario (streaming videos, resuming large file downloads).
404ing on getting the entire file would imply that range requests would also fail, when in actual fact some of them could succeed. If the range requested is not completely uploaded we can return a 416 - Requested Range Not Satisfiable.
I suppose this could be left as an option for implementers: if you want to allow partial downloads, support 403 or whatever. If you don't, then just supporting a 404 is fine (although it seems lame).
As I mentioned on Hacker News, I like 416 Requested Range Not Satisfiable for both HEAD and GET, because technically the request could be satisfied if a range was specified that only included segments of the file that had already been uploaded.
While returning 416 in response to a request without a Range isn't defined, it's not a huge stretch to think of such a request as including an implied Range of everything (i.e. something like bytes=0-).
The only problem is how to let the client know what ranges are already available. Content-Range would work if you only allowed partial uploads to occur sequentially, but if you want to support parallel uploads of different parts of the file at the same time, you're going to need a new header.
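The "implied Range of everything" idea can be sketched as follows. This assumes sequential uploads (so a single `uploaded` byte offset describes what is available) and handles only a single `bytes=first-last` or `bytes=first-` range; suffix ranges and multiple ranges are deliberately left out. The function names are illustrative, not part of any spec:

```python
import re


def requested_range(range_header, total_size):
    """Return (first, last) inclusive byte positions for a single range.

    A missing header is treated as the implied range bytes=0-.
    """
    if range_header is None:
        range_header = "bytes=0-"
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if match is None:
        raise ValueError("unsupported Range header: %r" % range_header)
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else total_size - 1
    # Clamp an over-long range to the end of the file.
    return first, min(last, total_size - 1)


def is_satisfiable(range_header, uploaded, total_size):
    """True if every requested byte lies below the sequential upload offset."""
    first, last = requested_range(range_header, total_size)
    return first <= last < uploaded


# 500 of 1000 bytes uploaded so far.
print(is_satisfiable(None, 500, 1000))           # implied bytes=0-
print(is_satisfiable("bytes=0-499", 500, 1000))  # fully inside the uploaded part
print(is_satisfiable("bytes=400-", 500, 1000))   # runs past the uploaded part
```

Under these assumptions a request with no Range header is simply the widest possible range, so it becomes satisfiable only once the whole file has arrived.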
This rabbit hole keeps getting deeper and deeper...
To support returning multiple non-sequential byte ranges the server would need to support returning something like a multipart message, but after reading some of this http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.16 it appears that's only valid if the client explicitly asks for a multipart byte/part response, and I can't see another mechanism to return multiple byte ranges without the client asking for them.
I'm almost tempted to say 404 and stuff it, but that's a cop-out. 400 feels semantically most correct because, whilst making a HEAD request is valid, making a GET request is not until the full contents of the file are uploaded, but the specs state it should only be sent in response to 'malformed syntax', which is not the case here. This leaves 403 as my current winner, unless the server can't be bothered to give a useful message, in which case 404 is fine.
416 feels like it's stretching the semantics too much and is slightly non-obvious behaviour.
I'd really like another new status code for this sort of situation. i.e resource not ready or something similar.
I think a 206 Partial Content is appropriate here, potentially with a multipart/byteranges Content-Type.
Reference: http://tools.ietf.org/html/draft-ietf-httpbis-p5-range-22#section-4.1
In the case of a file that has been created with no bytes yet transferred, a 204 No Content might be appropriate, assuming the file upload HTTP calls might complete at a later point in time.
400 is misleading. It conveys the idea that the client has something wrong in their request, which is not true.
Consider a regular non-resuming upload: 404 is perfectly valid; if the upload completed, it'll be there, if it didn't, it won't. In that light, I don't feel like 404 is lame or a cop out. GET will typically be used by a reader role (HEAD, POST and PUT by a creator role); for a reader, the file does not exist until it is ready for consumption, i.e. complete.
I stand by the 404. It allows by far the simplest server implementation, which was the whole point of this discussion. 403's also good.
I'd really like another new status code for this sort of situation. i.e resource not ready or something similar.
There's always the option of using one of the above with a different (non-standard) status message...
@kevinswiber on the 206, from the RFC:
The request MUST have included a Range header field (section 14.35) indicating the desired range, and MAY have included an If-Range header field (section 14.27) to make the request conditional.
Going against a SHOULD may be acceptable but we definitely shouldn't go against MUST directives.
@brunodecarvalho Ah, you're absolutely right. For GET requests with a Range header... and a file that's only partially uploaded, a 206 is appropriate for a valid range.
With an invalid Range header on a GET, the right response is 416 Range Not Satisfiable.
I think the rules of 403 Forbidden apply here when there is no Range header on a GET request and a file is only partially uploaded. The rule states that servers can show a 403 if they want the client to know the request is being actively refused; otherwise show a 404.
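The three cases in this comment can be written down as a small decision function. This is only a sketch of the reasoning as stated here, not an agreed part of the protocol; the parameter names, including the `refuse_openly` switch between 403 and 404, are made up for illustration:

```python
def choose_status(has_range, range_covered, upload_complete, refuse_openly=True):
    """Pick a status code for a GET on a possibly incomplete upload.

    has_range       -- the request carried a Range header
    range_covered   -- the requested range lies entirely in uploaded bytes
    upload_complete -- the whole file has been uploaded
    refuse_openly   -- hypothetical server setting: admit the resource
                       exists (403) rather than hide it (404)
    """
    if has_range:
        # Range request: serve what exists, or say the range cannot be met.
        return 206 if range_covered else 416
    if upload_complete:
        return 200
    # No Range header and the file is only partially uploaded.
    return 403 if refuse_openly else 404


print(choose_status(True, True, False))    # valid range on a partial upload
print(choose_status(True, False, False))   # range not yet uploaded
print(choose_status(False, False, False))  # whole file requested too early
print(choose_status(False, False, False, refuse_openly=False))
```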
@kevinswiber @brunodecarvalho
I agree that 400 is too brute force and implies the wrong thing.
I'm quite happy with 403 or falling back to 404 depending on implementation, but I'd like to just check that we aren't chucking out 405 - Method Not Allowed without good consideration.
One of the cons with 405 is that it's uncertain how permanent that response is treated as being, i.e. should you expect a 405 response for a resource to always be 405, or is it like 404 or 5xx, whereby at some point in the future the resource will be retrievable? As I write this it feels like it could be interchangeable with 403, as there may be times when the server doesn't want to return any other acceptable methods.
Thoughts?
@sandfox
Re: 405 - I think the method is allowed, but the correct server state doesn't exist to serve the response.
Reading the spec again, it seems 403 is not the right way to go. RFC 2616 explicitly states that a client making a request which returns a 403 response SHOULD NOT attempt that request again.
Unfortunately, there's no 2xx Pending status code. (This would be great for APIs, too.)
In the Web API world, we would likely return a 303 See Other that points to a representation communicating the pending status. Alternatively, API authors might respond with a 200 and explain a "pending" status in the response body.
Thinking deeper, I believe my latest recommendation would be a 503 Service Unavailable. With 503, a Retry-After header can be included. I think this is exactly what we want. "The server can't serve this to you at the moment, but try again soon."
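For illustration, here is roughly what that could look like as a tiny WSGI application. The `UPLOADS` dictionary, the file id, and the 30-second delay are all assumptions made for the sketch; in particular the delay is an arbitrary guess, since the server has no real knowledge of when the upload will finish:

```python
# Hypothetical upload state: file id -> True once the upload is complete.
UPLOADS = {"12345": False}


def app(environ, start_response):
    """WSGI app answering 503 + Retry-After while an upload is unfinished."""
    file_id = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
    extra_headers = []
    if file_id not in UPLOADS:
        status, body = "404 Not Found", b"Not found\n"
    elif not UPLOADS[file_id]:
        status = "503 Service Unavailable"
        body = b"Upload in progress, try again shortly.\n"
        # Arbitrary delay in seconds; the real completion time is unknown.
        extra_headers.append(("Retry-After", "30"))
    else:
        status, body = "200 OK", b"(file contents would go here)\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ] + extra_headers
    start_response(status, headers)
    return [body]
```

It can be tried locally with the standard library, e.g. `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`, and then requesting `/files/12345`.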
@kevinswiber the only downside to 503 is that it somehow suggests the server is at fault, which isn't really true.
The closest thing to a Pending status is 202 - Accepted, which I think you could argue would be valid but a complete abuse, as it would cause no end of confusion for people not experienced with the protocol, end users (joe bloggs on his pc) and many software clients that would think it meant everything has really worked when it really hasn't.
I'm swinging behind a 404 for generic GET requests and allowing servers to optionally support GET requests with range headers and whatnot, along with an appropriate response (as mentioned above). My argument is that if you have knowledge of the protocol, you know you can make a HEAD request without needing to be told this by the server. If you don't know (because you're a user trying to view a picture or whatever), then in the absence of a response saying come back later (which can be put in a 404 anyway) you only care about knowing it's there in a complete state; if it's not, it may as well be completely non-existent to you.
TL;DR, something like this:
MUST return a 404 for resources that aren't fully uploaded.
OPTIONAL/SHOULD: may return partial sections (or 416 if appropriate) for clients making requests with content-range headers.
Thoughts?
@sandfox
I agree with your comments on 503.
The only issue is using the protocol to communicate the resource is expected to be available later. A Retry-After header is considered acceptable on a 503 response and is the only way to hack in that protocol-level communication (with the current spec, of course).
Using a 404, the protocol does not communicate that the resource will be available soon, so that information should be included in the response body. That might be the best compromise.
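One way to picture this compromise: the status line stays a plain 404, and the body carries the "not ready yet" information. The JSON field names below are invented for the sketch; nothing in the protocol or in HTTP defines them:

```python
import json


def not_ready_response(uploaded, total):
    """Build a 404 whose body explains that the upload is still in progress.

    Returns (status_code, headers, body_bytes). All JSON field names are
    hypothetical and would be up to each implementation.
    """
    payload = {
        "error": "not_found",
        "reason": "upload_incomplete",
        "bytes_uploaded": uploaded,
        "bytes_total": total,
    }
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return 404, headers, body


status, headers, body = not_ready_response(500, 1000)
print(status)
print(body.decode("utf-8"))
```

Clients that only look at the status code see an ordinary 404, while clients that know the implementation can read the body and decide to retry.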
Some other thoughts...
WebDAV (of which I am no expert) seems to have a Status-URI header. That seems like it would be pretty convenient for clients to check the status of the file before requesting it again. I believe it's often used in a response of 102 Processing, which seems an awful lot like 202 Accepted, but I believe the semantics limit it to long-running state change requests (e.g., MOVE, COPY).
I've looked for a 3xx status code that temporarily redirects to a status of the resource, but I'm not sure any of the existing status codes really fit. This would be ideal, in my opinion, and would communicate, "This resource is not yet available. You are being redirected to a URI that communicates more information regarding this resource's availability. When the resource is ready to be retrieved, subsequent requests to this URI will return the actual resource." I've needed this myself more than once.
It's almost unbelievable that this is such a hard nut to crack. Ah, well. I still :heart: HTTP. ;)
@kevinswiber Only downer with Retry-After is that the server has no way to know when it should expect the resource to be ready, because it's relying on external agents that may take an indeterminate amount of time to finish uploading the resource.
With WebDAV I feel a lot of clients aren't going to understand its semantics very well (either in terms of software, or humans). (Also, WebDAV feels like an attempt to hack/smash OS filesystem calls into HTTP.)
I agree, another 3xx like you describe would be super useful.
HTTP's vagueness is its downfall sometimes...
I'm having an about turn on this completely.
In the interest of simplicity can we just leave this up to the implementation to do whatever they feel like? It's not really core to the problem of file uploading and isn't going to affect the ability to upload files at all. The more opinionated we get about things that aren't essential, the harder/less likely we make it for others to implement the protocol.
If people really want this sort of thing, it could go into an extension of the protocol; in fact there could be many extensions depending on desired behaviour and problem domain.
Following on from this, is it worth sticking something in the protocol to explicitly state that the behaviour is left undefined? (I'm probably going to smash up a quick PR right now thinking about it)
In the interest of simplicity can we just leave this up to the implementation to do whatever they feel like? It's not really core to the problem of file uploading and isn't going to affect the ability to upload files at all.
I agree with @sandfox. tus is a protocol to upload and not to download files and this behaviour should be defined for each implementation.
Edge case. Don't really have a strong opinion on whether it should simply return 404 Not Found or give some sort of indication that an upload is under way, like 416 Requested Range Not Satisfiable. For the latter, the RFC says servers "SHOULD" respond with 416 if a Range header was sent in the request but it doesn't say anything about it not being allowed or recommended otherwise.
Probably best to just leave a small note on the protocol draft stating that default behavior is to report 404 until the upload has been completed but server implementations are free to roll in their own behavior as they deem fit.