golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.12k stars 17.68k forks source link

net/http: distinguish POSTed form values #3630

Closed moraes closed 9 years ago

moraes commented 12 years ago
Request: add a way to access parsed Request body values separately. 

There's currently Request.Form, which merges values from the URL query and the parsed
body. An accessor to the body parameters without query values would be welcome
(Request.FormBodyValues(), maybe?).

Just to keep the request registered. 

Related discussion:
http://groups.google.com/group/golang-nuts/browse_thread/thread/ef76ea0e57a32424
rsc commented 12 years ago

Comment 1:

This has come up before, and I don't think we're going to enable it.
Typically doing this is a mistake.
If you must do it, you already can:
queryForm := url.ParseQuery(r.URL.RawQuery)
r.URL.RawQuery = ""
Then you can use queryForm to get at the query parameters and call
r.FormValue or r.ParseMultipartForm to get at the POST-ed parameters.

Status changed to Unfortunate.

moraes commented 12 years ago

Comment 2:

I'll say this is unfortunate indeed since it is a common and simple distinction in
frameworks from major languages. :)
When I need the distinction I'll use:
// parseForm calls Request.ParseForm() excluding values from the URL query.
//
// It returns an error if the request body was already parsed or if it failed
// to parse.
func parseForm(r *http.Request) error {
    if r.Form != nil {
        return errors.New("Request body was parsed already.")
    }
    tmp := r.URL.RawQuery
    r.URL.RawQuery = ""
    if err := r.ParseForm(); err != nil {
        return err
    }
    r.URL.RawQuery = tmp
    return nil
}
For query values, I simply access r.URL.Query().
rsc commented 12 years ago

Comment 3:

I am aware that many web frameworks distinguish the two, but web
frameworks are pedantic about a lot of things.  Go encourages
simplicity: let the transport take care of the details and let your
app work at a higher level.  The parameter was sent.  Why does the
specific encoding it used matter?
Instead of reimplementing this functionality in your app, why not let
go of the details and just use r.FormValue?
moraes commented 12 years ago

Comment 4:

Honest question, not rhetorical: for an OAuth server (aka provider), would you ignore if
a token request parameter came from the request body or the URL query? I am just
ignoring query values when body parameters are required by the spec. Wrong?
patrickmn commented 12 years ago

Comment 5:

I like this functionality, but FWIW it _is_ easier to exploit something which was just
written "intuitively" (I don't think the workaround is intuitive).
func MyHandler(...) {
        if req.Method("POST") {
                name := req.FormValue("name")
                cc := req.FormValue("cc")
                ...
                sendConfirmationEmail(user, email)
        } else {
                ...
        }
}
-----
"Your friend John has invited you to join YourSite. Go to <a
href="https://yoursite.com/register?sessId=realLookingGarbage&verylongquerystringlalala&someA92581258KJQWTQ9125812512590&email=bwahah%40eliteteam.ro">https://yoursite.com/register<;/a>
to join him today!"
->
...
<form method="POST"> <!-- no action value -->
   <input name="name">...</input>
   <input name="cc">...</input>
   <input name="ccexp">...</input>
   <input name="cvc">...</input>
   <input name="email">...</input>
   <input type="hidden" name="csrftoken" value="A92M1859MF812915N7125M942">...</input>
   <input type="submit"...></input>
</form>
...
*Attacker pays his bills*
Average Joe can't really help himself here. The URL looks normal, it starts with
https://, the padlock is there, etc.
I also am trying to imagine a scenario where you would use both body and request params
interchangeably, but can't think of any.
I don't think the functionality necessarily needs to be split up, but IMO there should
be a "real" way to get params only from the body. What Brad suggested would work:
"Keeping that behavior, it's possible we might one day add accessors for just-body
parameters, but probably not new exported fields."
rsc commented 12 years ago

Comment 6:

At what point does the query string cause a problem in this
hypothetical example?  I don't see it.
patrickmn commented 12 years ago

Comment 7:

Here's a real example (http://play.golang.org/p/ddFnVS6F-_ -- works locally--I assume
play restricts listen, etc):
package main
import (
    "fmt"
    "net/http"
    "strings"
)
func Handler(w http.ResponseWriter, req *http.Request) {
    if req.Method == "POST" {
        email := req.FormValue("email")
        // ...
        fmt.Println("Sending confirmation email to:", email)
    }
}
const DeceitUrl =
"http://localhost:21847/handler?realLookingGarbage&verylongquerystringlalala&someA92581258KJQWTQ9125812512590&email=bwahah%40eliteteam.ro"
func main() {
    http.Handle("/handler", http.HandlerFunc(Handler))
    go http.ListenAndServe(":21847", nil)
    http.DefaultClient.Post(DeceitUrl, "application/x-www-form-urlencoded", strings.NewReader("email=usersrealemail@example.com"))
}
# go run req.go
Sending confirmation email to: bwahah@eliteteam.ro
Assuming that the page handler is doing nothing but checking whether the request type is
POST, this will let a malicious user override arbitrary values in the form via the
querystring by e.g. sending them an email with a "link" that's actually href'ing
something else. This particular example works because the form posts to the same page
(no action value in <form>), so the query string is retained. You could expect
developers to always use "action=/somepage", or users to read the whole querystring even
if it's very long, but this doesn't feel right to me.
Similarly, if a dev isn't actually checking the request method for e.g. an account
deletion handler, one could easily get a user to click on a link that goes to
/confirmdeleteaccount?confirmed=1 (although I think this is a weaker argument since they
_should_ be employing CSRF tokens/referer checking anyway, and similar POST requests are
just slightly harder to forge.)
Maybe giving POST params precedence over GET ones is all that's needed to solve the
first issue, and I think that actually makes sense in general (POSTs being "heavier"
than GET.)
patrickmn commented 12 years ago

Comment 8:

By POST and GET params I of course mean body params having precedence over querystring
ones.
patrickmn commented 12 years ago

Comment 9:

Also, I shouldn't say "assuming they're doing nothing but" -- CSRF tokens, referer
checking and the works wouldn't do a thing against the above unless you apply your
workaround or add a form action.
I'd be happy to take a stab at the precedence if you think it makes sense. Seems like
it's just a matter of changing return vs[0] to return vs[len(vs)-1] in
net/url.Values.Get.
moraes commented 12 years ago

Comment 10:

OpenID 1.1: 
"When a message is sent as a POST, OpenID parameters MUST only be sent 
in, and extracted from, the POST body." 
http://openid.net/specs/openid-authentication-1_1.html 
OAuth 2: 
"The parameters can only be transmitted in the request body and MUST 
NOT be included in the request URI."
http://tools.ietf.org/html/draft-ietf-oauth-v2-26
rsc commented 12 years ago

Comment 11:

Thanks for helping me understand.  This is the part I was missing:
"This particular example works because the form posts to the same page
(no action value in <form>), so the query string is retained."
I agree that POST values should take precedence. I would also be happy
to have a r.PostForm field like r.Form but containing only the posted
values.
Russ

Labels changed: added go1.1.

Status changed to Accepted.

patrickmn commented 12 years ago

Comment 12:

Ok. Taking a stab at it.
patrickmn commented 12 years ago

Comment 13:

CL: http://golang.org/cl/6210067/
lukescott commented 12 years ago

Comment 14:

GET and POST should be separate. Combining the two actually makes things more
complicated. For example, I want to get the GET values and the raw POST body (it's not
URL encoded). By using request.FormValue the post BODY is read - which can only be read
once. Because of this I have to ignore the entire API and parse GET values myself so the
POST body isn't read and cleared before I have a chance to read it.
I would prefer the POST body not was read unless explicitly intended. This could be made
more clear by having:
request.GetValue
request.PostValue
The request.FormValue can work as is (reads either/or) but when a key conflicts between
GET and POST the POST should take precedence over GET.
gopherbot commented 12 years ago

Comment 15 by prencher:

We're dealing with query values, body values and then a number of different methods of
request - GET, POST, PUT etc.
Body values can come from PUT requests too, making the "PostForm" naming incorrect.
BodyForm or similar would be more correct.
As far as the API, I'd like to propose the following set:
QueryValue
QueryValues
BodyValue
BodyValues
FormValue // Contains both query and body, as presently
FormValues
As far as I can tell, providing this set would completely eliminate the need to touch
the raw storage for the query and body data at all, unless you're looking to modify them.
moraes commented 12 years ago

Comment 16:

+1 for "BodyForm". More appropriate.
gopherbot commented 12 years ago

Comment 17 by xiam@astrata.com.mx:

Another way to avoid mixing POST with GET variables is by using
Request.ParseMultipartForm on data sent with enctype="multipart/form-data", in this case
POSTed variables can be pulled from Request.MultipartForm.Value instead of Request.Form.
curl 'http://localhost:8081/test/dump_all?get=here' -F "post=is_it"
{"GET":{"get":"here"},"POST":{"post":"is_it"}}
rsc commented 12 years ago

Comment 18:

This issue was closed by revision abb3c0618b658a41bf91a087f1737412e93ff6d.

Status changed to Fixed.