gobuffalo / buffalo

Rapid Web Development w/ Go
http://gobuffalo.io
MIT License
8.08k stars 578 forks source link

Can not read request body multiple times #1480

Closed iansmith closed 5 years ago

iansmith commented 5 years ago

Description

If this is a feature request, explain why it should be added. Specific use-cases are best.

This is part feature request and part bug report. In short, slack sends messages to slack apps via HTTP (good) and these are signed with HMAC-SHA256 (good). However, the body is contained the part that is signed (bad). They put a content-type of content_type=application/x-www-form-urlencoded.

Here is a sample body, which is form urlencoded, but a form's data is not sensitive to order and signature algorithms are:

token=rV3iSwpJXizkK66GQLm8NQef&team_id=T8DRVQJ4U&team_domain=slightdrizzle&channel_id=CEAN3G59D&channel_name=apptesting&user_id=UC6FBA788&user_name=iansmith&command=%2Fmaz&text=maz&response_url=https%3A%2F%2Fhooks.slack.com%2Fcommands%2FT8DRVQJ4U%2F493213364772%2FkxNOuEO0QMFgUeVAKf4MH9Sv&trigger_id=494432800071.285879834164.51e506e22646d57273f1b3df4a0201ad

Steps to Reproduce the Problem

Please describe in painful detail what you did (so others can play along with you) to get to this point. This includes things like the exact command(s) you used, or the curl command you used, that sort of thing.

If you want to reproduce the problem, just create a slack app and point a slash command at your machine (or ngrok tunnel). When the http request arrives in buffalo it will have an empty body, because the form data has been extracted.

Expected Behavior

What did you what to happen? Tell us a story. We love to read.

I tried a couple of approaches to try to work around this. I tried using my own middleware (clearing the old first) and I tried going into the gorilla mux layer as well. Neither of these prevented the consumption of the body before it reached my code.

Actual Behavior

In the happiest of happy places what should have happened?

I would say there are three things that could be done to fix it:

  1. Allow a mux to be passed in as part of app setup. It would be up to the person passing it in to prevent collisions in the URL space.

  2. Offer some type of "raw" middleware that offers no processing before the call to the middleware. This would seem to require avoiding or extending gorilla mux since it doesn't seem to offer this.

  3. Always offer a way that the request body can be replayed. This could be part of context or similar. The http.Request object offers a bit of help here with

    GetBody func() (io.ReadCloser, error) // Go 1.8

Info

Please run buffalo info and paste the information below where it says "PASTE_HERE".

``` $ buffalo info ### Buffalo Version v0.13.7 ### App Information Pwd=/Users/iansmith/mazarin.src/mazarin/go/src/github.com/iansmith/mazarin Root=/Users/iansmith/mazarin.src/mazarin/go/src/github.com/iansmith/mazarin GoPath=/Users/iansmith/mazarin.src/mazarin/go Name=mazarin Bin=bin/mazarin PackagePkg=github.com/iansmith/mazarin ActionsPkg=github.com/iansmith/mazarin/actions ModelsPkg=github.com/iansmith/mazarin/models GriftsPkg=github.com/iansmith/mazarin/grifts VCS= WithPop=true WithSQLite=true WithDep=false WithWebpack=false WithYarn=false WithDocker=true WithGrifts=true WithModules=false ### Go Version go version go1.11.2 darwin/amd64 ### Go Env GOARCH="amd64" GOBIN="/Users/iansmith/mazarin.src/mazarin/go/bin" GOCACHE="/Users/iansmith/Library/Caches/go-build" GOEXE="" GOFLAGS="" GOHOSTARCH="amd64" GOHOSTOS="darwin" GOOS="darwin" GOPATH="/Users/iansmith/mazarin.src/mazarin/go" GOPROXY="" GORACE="" GOROOT="/Users/iansmith/mazarin.src/go" GOTMPDIR="" GOTOOLDIR="/Users/iansmith/mazarin.src/go/pkg/tool/darwin_amd64" GCCGO="gccgo" CC="clang" CXX="clang++" CGO_ENABLED="1" GOMOD="" CGO_CFLAGS="-g -O2" CGO_CPPFLAGS="" CGO_CXXFLAGS="-g -O2" CGO_FFLAGS="-g -O2" CGO_LDFLAGS="-g -O2" PKG_CONFIG="pkg-config" GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/w0/fw3qzvvx54b39b790yxq6v100000gn/T/go-build418835592=/tmp/go-build -gno-record-gcc-switches -fno-common" ### Node Version Node Not Found ### NPM Version NPM Not Found ### Yarn Version Yarn Not Found ### PostgreSQL Version PostgreSQL Not Found ### MySQL Version MySQL Not Found ### SQLite Version 3.24.0 2018-06-04 14:10:15 95fbac39baaab1c3a84fdfc82ccb7f42398b2e92f18a2a57bce1d4a713cbaapl ### Dep Version could not find a Gopkg.toml file ### Dep Status could not find a Gopkg.toml file ### go.mod module github.com/iansmith/mazarin ```
sio4 commented 5 years ago

I am not in front of my laptop so I can't test it. But I guess you can get each values (token, team_id, and so on) by bind() to structure (I think this is easist way) or by calling Request() and access raw values. Did you tried them?

iansmith commented 5 years ago

Yes, the Request() isn't really raw. That is the key problem: context.Request().Body returns an empty writer because somebody has already read it and go's request only lets you read the body once under normal circumstances. I can get the values just fine from the parsed out form data, but I have no way to know what order they were sent in (and that matters for the signature).

sio4 commented 5 years ago

As I remeber the raw body is stored as some variable and request logger logging it as is. Isn't it?

markbates commented 5 years ago
  1. Allow a mux to be passed in as part of app setup. It would be up to the person passing it in to prevent collisions in the URL space.
  1. Offer some type of "raw" middleware that offers no processing before the call to the middleware. This would seem to require avoiding or extending gorilla mux since it doesn't seem to offer this.

Always offer a way that the request body can be replayed. This could be part of context or similar. The http.Request object offers a bit of help here with

This is a bug and should be fixed. Context.Request() should return an unmodified request.

iansmith commented 5 years ago

@markbates I tried a couple of those yesterday and found that the result was the same. I will try to create a small test and show you what I saw yesterday.

iansmith commented 5 years ago

On the subject of the Prehandler, I tried it. Here is the code (possibly wrong) that I used:

func App() *buffalo.App {
    if app == nil {
        app = buffalo.New(buffalo.Options{
            Env:         ENV,
            SessionName: "_mazarin_session",
            PreHandlers: []http.Handler{&SlackHandler{}},
        })

// ...

//SlackHandler is used for testing slack integration.
type SlackHandler struct{}

// this produces
// 2018/12/03 15:00:23 Request Body is empty (and size of form data is 11)

func (s *SlackHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    var buf bytes.Buffer
    n, err := io.Copy(&buf, r.Body)
    if err != nil {
        log.Fatalf("error during body copy: %v", err)
    }
    if n == 0 {
        log.Fatalf("Request Body is empty (and size of form data is %d)\n", len(r.Form))
    }
    HandleSlackSlashCommand(w, r)
}
sio4 commented 5 years ago

I suspected calling ParseForm() in code below, in the App.newContext(),

https://github.com/gobuffalo/buffalo/blob/e362599b8489bd6e0b62e357bdeb044498c04143/context.go#L55-L61

but the first call of ParseForm() was generated by buffalo.MethodOverride(), by calling req.FormValue("_method"). I think those two calls should be checked with deep understanding of full flow of buffalo (that I don't have, sadly :-)

sio4 commented 5 years ago

Oops! mw-csrf.requestCSRFToken also calls PostFormValue() (and it calls ParsForm() finally). It is somewhat big...

sio4 commented 5 years ago

I looked into this issue more deeply today.

Since request.Body is a stream and Request.ParseForm() consumes it without duplication, the original request body should be saved before calling it or similar functions. In Buffalo, the function is called by:

  1. App.newContext() which is called by handler functions include RouteInfo.ServeHTTP(),
  2. MethodOverride() which is called by App.ServeHTTP(),
  3. and defaultErrorHandler().

Since newContext() is called before all middlewares, making a middleware is not a solution for this issue without fixing the context generator. (the code below is that part. IMO anyway, it is not a good idea to copy those values and save them in memory twice.)

https://github.com/gobuffalo/buffalo/blob/e362599b8489bd6e0b62e357bdeb044498c04143/context.go#L55-L61

For PreWares and PreHandlers, they are invoked after MethodOverride() which is calling ParseForm() indirectly in default App.ServeHTTP(). *It means `Pre` also not a chance for it.** So if they should be invoked before buffalo doing nothing, the order of execution should be changed. (It seems to be a reasonable approach for this issue.)

Just for the test, I assigned dummy MethodOverride and remove above part from newContext() and tested with middleware below (inserted before mw-csrf since it also calls ParseForm) so I can access it from any handlers:

func bodySaver(next buffalo.Handler) buffalo.Handler {
    return func(c buffalo.Context) error {
        body, err := ioutil.ReadAll(c.Request().Body)
        if err != nil {
            c.Logger().Errorf("could not read request body: %v", err)
        }
        c.Set("request_body", body)
        c.Request().Body = ioutil.NopCloser(bytes.NewReader(body))
        return next(c)
    }
}

I didn't test but if I add this code block into the custom MethodOverride, without other changes, It should be possible to access the original body but it cannot be saved into the context so real handlers cannot access them. (Pre* approach also has same problem but just doing its own job in that place.)

So, I think, there are some possible approaches:

  1. Make something like buffalo.Request to wrap http.Request and add features including permanent Body. (universal way to give a static access to the original body)
  2. Fix App.ServeHTTP() and use PreHandlers/PreWares per application on users own demand. (details are in charge of users and the Body cannot be saved permanently for normal handlers. Also, it is cannot be used for a specific route and it works as App wide function. Not a good idea.)
  3. Remove Form to Params copying from context generator, remove or move MethodOverride() to safe place, and give a chance to use middleware and storing Body as context value. (maybe breaking change but I think this is the best solution.)
  4. ...or mixing some of above.

How do you think about?

iansmith commented 5 years ago

@markbates I tried and was successful using buffalo's app as a http.Handler in my own mux. thanks!

My guess, although I don't have great proof, is that gorilla mux is doing this consumption of the body when the incoming request is form-encoded data. I am saying this because some simple experiments using only gorilla mux and it exhibited the "body already read" problem.

markbates commented 5 years ago

Great! I was going to suggest that as a work around until v0.14

markbates commented 5 years ago

closed as it is fixed in development.

sio4 commented 5 years ago

Hi @markbates, Is this a temporary fix for the issue? Basically, access to the Body is not needed for most cases and I think this should be implemented as middleware or similar ways for the user who want this.

And current implementation uses packd.virtualFile but it seems that it requires the triple size of the original Body. Even though the request handler just runs during a few second and some more, I worry about memory consumption when it used for file upload or big content.

markbates commented 5 years ago

I’m happy for a better solution; or better to improve the implementation of packd.File.

Essentially though; yes it is. If anyone, anywhere down the line reads it; it’s broken.

We need a re-readable reader for the body and that was the first one I had on hand. On Dec 5, 2018, 10:59 PM -0500, Yonghwan SO notifications@github.com, wrote:

Hi @markbates, Is this a temporary fix for the issue? Basically, access to the Body is not needed for most cases and I think this should be implemented as middleware or similar ways for the user who want this. And current implementation uses packd.virtualFile but it seems that it requires the triple size of the original Body. Even though the request handler just runs during a few second and some more, I worry about memory consumption when it used for file upload or big content. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

markbates commented 5 years ago

@sio4 I've done some work on packd.File today and packd.FileInfo to reduce the overhead and improve efficiency. Can you take a look at master when have a chance? Thanks.

sio4 commented 5 years ago

Hi, @markbates,

Of course, That's my pleasure! I looked into the changes and I think that's good enough! Anyway, I just edited some parts of the code and made a PR. https://github.com/gobuffalo/packd/pull/7

It is not directly related to this issue but I just fixed Seek() to support normal seeking, not limited to go to the starting point, and changed some codes for easy reading, and removing partially duplicated codes. Please check the PR.

By the way, these packages including Genny and Packd are so interesting!