ninenines / cowboy

Small, fast, modern HTTP server for Erlang/OTP.
https://ninenines.eu
ISC License
7.29k stars 1.17k forks source link

REST: Ordering of allowed_methods/malformed_request and is_authorized/forbidden #587

Closed PaulSD closed 11 years ago

PaulSD commented 11 years ago

Is there a reason that allowed_methods and malformed_request are called before is_authorized and forbidden?

My thinking is that unauthenticated or unauthorized clients should not be able to discover what methods/requests an application would accept if they were authenticated/authorized. For example, knowing that my application will only accept GET requests might tell an unauthenticated attacker something about the functionality of my application that I don't want them to know.

In fact, I think an argument could be made to move is_authorized/forbidden above known_methods and uri_too_long too, although I feel like those are lower-level checks that are intended more to indicate what the underlying server implementation (rather than the application running on that server) would accept, so perhaps it isn't as important that they be protected.

bullno1 commented 11 years ago

Some resources are owned by some users so GET/HEAD are allowed for all but only a few can PATCH/PUT/DELETE. Same argument about authorization: Anonymous GET should be possible, I think.

essen commented 11 years ago

That's not how HTTP is defined though. In HTTP, when you try accessing a resource you're not authorized to access, you get an unauthorized response (with details as to how you can get access). Clients will generally not attempt to authenticate until they get that unauthorized response.

Hiding methods will not give you any particular security. If the application allows creating user accounts, then creating an account is enough. If you are talking about some admin only stuff, then you probably shouldn't have those public to begin with anyway (you can easily have 2 listeners, one being public, the other allowing admin stuff only for some IPs, and filter according to that; plus you can just stop the admin listener when not needed).

As for known_methods, it's for the server implementation, yes, so in most cases you don't need to modify it.

PaulSD commented 11 years ago

I think you both misinterpreted what I was saying... I'm not arguing to change the way that authentication/authorization is done, I'm simply arguing to have as much application logic as possible occur after the authentication/authorization check.

My use case is a private web application used by a very large organization. Unauthenticated users should have no access to my system at all. Authenticated users may or may not be allowed to access my system based on their role in the organization, and they may be accessing my system from anywhere (possibly one of our hundreds of offices, or possibly from the user's home, so I cannot simply filter by IP address or disable my listener when it is not needed). My application does not allow creating user accounts.

The most obvious way for me to implement such an application as a Cowboy REST Handler is to implement an is_authorized() method that forces the user to authenticate, then implement a forbidden() method that check's the user's role in the organization and blocks access if they don't have an appropriate role. I would probably also want to implement a method_not_allowed() method that limits methods to those that my application supports, and maybe also a malformed_request() method that limits access based on other limitations within my application.

My concern is that my systems are regularly scanned by external (unauthenticated) entities looking for vulnerabilities in my systems, and I want to limit their visibility into the system as much as possible. However, if I implemented my application as described in the previous paragraph, unauthenticated users would be able to determine things like whether or not this particular REST application implements support for POST requests. You are right that limiting access to this information doesn't add much tangible security - unauthenticated users won't be able to use the system either way. However, even small tidbits of information like this can give an attacker valuable hints about the system they are attacking. Most modern computer security systems are based on the least privilege principle for this very reason, and exposing information such as the supported HTTP Methods to unauthenticated users violates that principle.

So, I guess my question is: Is there a good reason to allow unauthenticated users to determine which HTTP methods an authenticated REST application supports? (Are there any use cases which would require that?) If not, would you accept a push request that made the method_not_allowed and malformed_request checks happen after the is_authorized and forbidden checks?

dvv commented 11 years ago

I believe you should simply use a custom middleware inserted before cowboy_router to mimick such behavior and restrict access before the control is passed to rest handler.

essen commented 11 years ago

I don't think I misinterpreted you. REST is defined in the HTTP/1.1 RFC. If you find the order of the callbacks isn't adequate, you need to send a patch to the RFC, not here. REST implies auto discovery, the exact thing you want to prevent, so the answer you're seeking isn't in REST.

You should probably do like dvv say, perform the authentication before getting to the handler, by changing the routes to just the login form or something when the user failed to authenticate.

PaulSD commented 11 years ago

REST is not defined in the HTTP/1.1 RFC, it is defined in Roy Fielding's Dissertation (http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm). Regardless, I don't see any language in either the HTTP/1.1 RFC or Roy Fielding's Dissertation that suggests an order for these callbacks. However, if you can point me at the specific text that you referenced, I will be happy to bring the issue up with the author(s) of that text.

My proposal does not prevent auto-discovery in general (authenticated users would still be able to do auto-discovery, and unauthenticated users could do auto-discovery if the app does not require authentication), it simply prevents an unauthenticated user from performing auto-discovery on an application that requires authentication. I don't see how this would violate anything in the HTTP or REST specifications.

PaulSD commented 11 years ago

Just for a point of reference, the Restlet framework for Java (http://restlet.org) is implemented such that authentication is checked (via a Guard) before the HTTP Method is checked (based on the methods implemented in the target Resource), which is equivalent behavior to what I'm proposing here.

essen commented 11 years ago

REST is the model that he created before and while working on HTTP/1.1. HTTP/1.1 describes everything you need to implement a RESTful application. You can find him saying just that here for example: http://webcache.googleusercontent.com/search?q=cache:Pn_lclbnE6EJ:tech.groups.yahoo.com/group/rest-discuss/message/6757+&cd=1&hl=en&ct=clnk

If a client accesses a resource on your service and you say it doesn't exist only because it's not authenticated, you are lying to the client and the client will think there's no resource at all. A proper service asks for authentication. But you can only work out how to fetch a resource if you can get the list of allowed methods. So you have to provide it, the client can't guess that otherwise. And the client being able to consume your entire service with no previous knowledge is one of the key point of REST.

So yeah not gonna happen. The middleware solution should work quite well for you though.

PaulSD commented 11 years ago

Ok, I'm fine with the middleware solution in my particular case. However, your reply makes me a bit concerned, because we are clearly not on the same page here, and that makes me think that one of us is misunderstanding something, and I'd like to reconcile that... If an unauthenticated client access a resource on my service, I do NOT want to say that it doesn't exist (I'm not sure how you got the impression that I wanted that) ... I want to ask for authentication (with a 401 response). However, as currently implemented, if an unauthenticated client access a resource on my service, they might get a 405 or 400 response instead of a 401. That is my only concern. I would like to see unauthenticated users always getting a 401, and only authenticated users getting a 405 or 400 if that is appropriate. Why would that change break auto-discovery, or anything else in the HTTP or REST specs?

essen commented 11 years ago

401 means accessing this resource using this method requires authentication. If the client then authenticates and you give it a 405 because the method is wrong, you can probably see the problem. And at that point what an automated client should do is undefined. So bad HTTP and bad REST there.

PaulSD commented 11 years ago

I guess that is the real issue here ... I don't really see the problem with that. I would expect the behavior of an automated client when it gets a 405 after having authenticating (because it got a 401) should be the same as it's behavior when it just gets a 405 (without getting a 401 first). Why would it be any different? It seems perfectly valid to me to tell a client "I won't even tell you if this method is supported until after you have authenticated", then tell them "sorry, it isn't supported" after they have authenticated, and I don't see how that is either bad HTTP or bad REST (although it is certainly better security).

It looks to me like we have a fundamental difference in opinion here, so perhaps I just need to agree to disagree on this. Thanks for being open to discussion.

essen commented 11 years ago

A client should never get a 405 though. 405 means the client is doing it wrong, and the only thing it can do about it is error out. 401 it can do something, 404 may be due to an outdated link on the server, but most others are the client's fault.

PaulSD commented 11 years ago

Right, that is kind of my point. A valid client doing valid things should never get a 405, so whether or not it gets a 401 before it gets a 405 doesn't really matter, because that should never happen anyway. However, a clever attacker (who is intentionally trying to do things wrong) might gain valuable information by getting a 405 instead of a 401 for certain methods, so I think there is value in always returning a 401 for unauthenticated users, and only returning a 405 after the user has authenticated.

essen commented 11 years ago

You missed mine. 401 means "you're doing it right, but you need to auth to use this method". 405 means "you can't use this method". You got it backward.

I'm not questioning whether it's good to hide everything in your situation, just explaining why the REST code is like this. It's just following HTTP.

PaulSD commented 11 years ago

Aha! You've found the root of our difference in opinion...

My read of the definition of 401 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) doesn't imply that "you're doing it right" ... only that "you need to auth before continuing". I seems to me that it is perfectly valid per HTTP to send a "you're doing it wrong" error only after sending a 401.

essen commented 11 years ago

"The request requires user authentication."

How do you know a request requires user authentication without first looking at it? You need to know what the action is, what resource is targeted, inspect the headers and so on. That's what all the callbacks before is_authorized do.

Your case of "hide everything because it's a private app" isn't really covered by the semantics of HTTP (aka REST). You can still do it if you want of course, but you're losing out a little.

PaulSD commented 11 years ago

True, at the time is_authorized() runs you may need to be able to read those request values to determine if authentication is needed, but you don't necessarily need to know if the application will accept those values.

It might be good to ensure that those values are syntactically valid per the HTTP standard before letting is_authorized try to read them, which is one of the reasons I think it probably makes sense to keep known_methods before is_authorized.

However, I would expect that typical authentication rules would be something like "require authentication for all POST requests" or "require authentication for all requests under /login/" or "require authentication for all requests that don't contain a session cookie", and you can perform all of those tests without knowing whether the application will actually accept DELETE requests, which is why I think it makes sense to call allowed_methods after is_authorized.

In fact, note that resource_exists currently runs after is_authorized, so if a client asks for a bogus resource, we will not return a 404 until after authentication occurs. Based on your argument, shouldn't resource_exists be called before is_authorized so that we validate that the target resource would be accepted by the application before attempting authentication?

essen commented 11 years ago

We do send 404 before authentication, if routing failed. But if we found the resource handler we can't know whether the resource exists and more importantly what to do about it until we unrolled the whole thing (because the resource not existing doesn't mean automatically 404, it depends on a number of other factors). Authentication may be required to find whether a resource existed previously and get the link to the new URL for it, for example.

PaulSD commented 11 years ago

Agreed.

So my argument is basically just that the behavior of allowed_methods should be consistent with the behavior of resource_exists. If we can wait until after authentication to make the final determination as to whether the resource exists, why can't we wait until after authentication to make the final determination as to whether the method is allowed?

essen commented 11 years ago

You need to know the method in order to send a request. If you don't know the method yet (entry point), you can use OPTIONS on the URL to get the list of methods and other info about how to access the resource. Then you can make the request.

It goes more or less like this:

You can't just skip the request line (or the required host header, that would have to be handled in malformed if Cowboy didn't do it before for you (said host header is a proper part of the "request line" in SPDY and HTTP/2.0 also)) to decide if accessing this resource is allowed, you have to make sense out of the request first. This is what all the callbacks before is_authorized are about. Then you can make an informed decision about whether to allow access to the resource or not.

Anyway I'm not going to repeat myself endlessly, sorry, I'll stop there. What you want to do is an edge case that's not defined in the spec, but that you can still do outside cowboy_rest if you want.

PaulSD commented 11 years ago

OK, thanks for discussing it with me.