carwyn / veillance

Nil Pointer Dereference with gzip/deflate Content #6

Open carwyn opened 7 years ago

carwyn commented 7 years ago

http://www.sueddeutsche.de

Failed to gzip decode: gzip: invalid headerpanic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x288 pc=0x5a9557]

goroutine 25 [running]:
compress/gzip.(*Reader).Read(0x0, 0xc42187d000, 0x1000, 0x1000, 0x411ff2, 0xc4211a6270, 0xd0)
    /usr/lib/golang/src/compress/gzip/gunzip.go:247 +0x37
golang.org/x/net/html.readAtLeastOneByte(0xbcc940, 0x0, 0xc42187d000, 0x1000, 0x1000, 0x7f479c0385c0, 0x0, 0x17f479ebaa000)
    /home/carwyn/src/golang.org/x/net/html/token.go:299 +0x60
golang.org/x/net/html.(*Tokenizer).readByte(0xc4211a6270, 0xc420102770)
    /home/carwyn/src/golang.org/x/net/html/token.go:273 +0x270
golang.org/x/net/html.(*Tokenizer).Next(0xc4211a6270, 0x68)
    /home/carwyn/src/golang.org/x/net/html/token.go:980 +0xef
golang.org/x/net/html.(*parser).parse(0xc42188a000, 0xc420102770, 0xc4211a6270)
    /home/carwyn/src/golang.org/x/net/html/parse.go:2001 +0xb3
golang.org/x/net/html.Parse(0xbcc940, 0x0, 0x7f18e7, 0x19, 0xc42004bcd8)
    /home/carwyn/src/golang.org/x/net/html/parse.go:2026 +0xdd
main.(*httpReader).run(0xc421474400, 0xc421188ce0)
    /home/carwyn/src/github.com/carwyn/veillance/interceptor/main.go:209 +0x11c2
created by main.(*tcpStreamFactory).New
    /home/carwyn/src/github.com/carwyn/veillance/interceptor/main.go:295 +0xb9e

carwyn commented 7 years ago

http://stackoverflow.com/questions/13130341/reading-gzipped-http-response-in-go

carwyn commented 7 years ago

This is a bug: if the Content-Encoding is deflate, it shouldn't be trying gzip, it should use zlib. Note, though, that some servers are broken and send the raw deflate format instead.

From: http://www.gzip.org/zlib/zlib_faq.html#faq38

"gzip" is the gzip format, and "deflate" is the zlib format. They should probably have called the second one "zlib" instead to avoid confusion with the raw deflate compressed data format. While the HTTP 1.1 RFC 2616 correctly points to the zlib specification in RFC 1950 for the "deflate" transfer encoding, there have been reports of servers and browsers that incorrectly produce or expect raw deflate data per the deflate specification in RFC 1951, most notably Microsoft.

carwyn commented 7 years ago

https://github.com/golang/go/issues/10377

carwyn commented 7 years ago

Neither the Go nor the Python stdlib url/http handlers deal with the deflate debacle for you. The Python Requests lib does, though, and it really is a case of try one, fail, try the other (i.e. peek at the bytes).

The Python Requests lib implementation is here

And this is a very useful test page: http://carsten.codimi.de/gzip.yaws/

carwyn commented 7 years ago

What the HTTP 1.1 standard has to say about this: