grafana / k6

A modern load testing tool, using Go and JavaScript - https://k6.io
GNU Affero General Public License v3.0
25.67k stars 1.26k forks source link

Sending binary request using binary data from HTTP response body doesn't work as expected #748

Closed sherrman closed 6 years ago

sherrman commented 6 years ago

Using a binary response body from an http.get in a new request body doesn't work. The new request body is corrupted.

response = http.get('http://localhost/binary');
http.post('http://localhost/compare', response.body);

In the above example the compare endpoint won't receive the same binary data as was returned by the binary endpoint. I've created a sample Go http server that serves files and has a compare endpoint. The readme shows the validation of the Go server along with the Python SimpleHTTPServer.

The k6 file also reads a binary file from disk and demonstrates that works against the compare endpoint.

https://github.com/sherrman/k6_binarybug

na-- commented 6 years ago

Thanks for reporting this and creating the test case! I'm currently investigating what's happening and my current hypothesis is that it's something UTF related, since when we're not using binary the uploaded file is exactly twice the size of the original. Potentially something goja (the JS runtime) does, since IIRC JS uses UTF-16 internally.

na-- commented 6 years ago

Here's some sample code that's not connected with k6 that I think demonstrates the issue:

package main

import (
    "fmt"

    "github.com/davecgh/go-spew/spew"
    "github.com/dop251/goja"
)

func dump(info string, thing interface{}) {
    fmt.Printf("%s: %s\n", info, spew.Sdump(thing))
}

func main() {
    b := []byte{'a', 'b', 'c', '\xe6', '\x97', '\xa5', '\x00', '\xee', '\xff'}
    dump("Original byte slice", b)

    s := string(b)
    dump("Byte slice converted to string", s)
    dump("String converted back to byte slice", []byte(b))

    rt := goja.New()
    rt.Set("testBytes", b)
    rt.Set("testStr", s)

    rtBytes := rt.Get("testBytes")
    rtString := rt.Get("testStr")

    dump("Bytes object from runtime", rtBytes)
    dump("Bytes export from runtime", rtBytes.Export())
    dump("Bytes export from runtime converted to string from go", string(rtBytes.Export().([]byte)))
    dump("Bytes to string from runtime", rtBytes.String())

    dump("String from runtime", rtString)
    dump("String export from runtime", rtString.Export())
    dump("String to string from runtime", rtString.String())
}

It would output something like this:

Original byte slice: ([]uint8) (len=9 cap=9) {
 00000000  61 62 63 e6 97 a5 00 ee  ff                       |abc......|
}

Byte slice converted to string: (string) (len=9) "abc日\x00\xee\xff"

String converted back to byte slice: ([]uint8) (len=9 cap=9) {
 00000000  61 62 63 e6 97 a5 00 ee  ff                       |abc......|
}

Bytes object from runtime: (*goja.Object)(0xc420140480)(97,98,99,230,151,165,0,238,255)

Bytes export from runtime: ([]uint8) (len=9 cap=9) {
 00000000  61 62 63 e6 97 a5 00 ee  ff                       |abc......|
}

Bytes export from runtime converted to string from go: (string) (len=9) "abc日\x00\xee\xff"

Bytes to string from runtime: (string) (len=30) "97,98,99,230,151,165,0,238,255"

String from runtime: (goja.unicodeString) (len=7 cap=7) abc日��

String export from runtime: (string) (len=13) "abc日\x00��"

String to string from runtime: (string) (len=13) "abc日\x00��"

Something strange is going on with goja's utf16 strings. I don't understand why the String() method (last line of the output) returns a mangled (len=13) copy of the data, given that it supposedly decodes it to a go string...

na-- commented 6 years ago

Ah, the issue probably originates when we convert binary (i.e. invalid utf8) "stings" to utf16, not sure we can actually get the original binary value back...

na-- commented 6 years ago

I spent some more time investigating this and there are some bad news and some good news...

The bad news is that there's no current workaround for the bug, basically once a binary response is treated as a utf-16 string, the information is lost. As an example, if I have create a "string" with all possible byte values:

b := make([]byte, 256)
for i := 0; i <= 255; i++ {
    b[i] = byte(i)
}
runtime := goja.New()
runtime.Set("testStr", string(b))

and I pass it to a goja runtime, where I dump the data in it by running a script like this:

for (var i = 0; i < testStr.length; i++) {
    buffer.push(testStr.charCodeAt(i));
}

I get the values from 0 to 127 fine, but the rest are encoded as 65533, which is basically a replacement character that substitutes for corrupt UTF sequences... So no recovering the original data from that...

But the good news is that I think we can solve the issue with a relatively simple fix. We can add a new HTTP request responseType option that specifies how to receive the response body, similar to the browser XMLHttpRequest.responseType option. By default it would be text, but if users supply binary, we'll return the original Go []byte value. Of course we'd have to change the HTTPResponse.Body type from string to interface{}, but there shouldn't be any issues with that.

There may even be some benefits, for example we can return null when users have enabled the new discardResponseBody option (we use an empty string now, which could be somewhat confusing). We might even rename the new discardResponseBody option to instead be responseType=none, so we don't have 2 separate options.

sherrman commented 6 years ago

That's awesome. I appreciate you looking into this quickly.

I like that solution. 👍