open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
4.25k stars 1.41k forks source link

[confighttp] snappy compress, Encode and NewReader inconsistent behavior #10584

Open tomatopunk opened 1 month ago

tomatopunk commented 1 month ago

Hello team, I hope you are having a good day.

Describe the bug

I am coding on the collect-contrib, about on a new component to receive prometheus_remote_write.

I have encountered a potential bug related to inconsistent behavior in Snappy compression when using encode -> NewReader and NewBufferedWriter -> decode.

When we compress or decompress from a stream, or read all of it and then compress or decompress it, we get different results

Steps to reproduce

What did you expect to see?


package main

import (
    "bytes"
    "fmt"
    "github.com/golang/snappy"
    "io"
)

func main() {
    test1()
    test2()
}

func test1() {
    data := []byte("Hello, World!")
    //encode
    compressed := snappy.Encode(nil, data)

    //decode
    reader := bytes.NewReader(compressed)
    sr := snappy.NewReader(reader)
    sb := new(bytes.Buffer)
    _, err := io.Copy(sb, sr)
    if err != nil {
        println("test1 decode error:", err.Error())
        return
    }

    fmt.Printf("test1 string1: %s, 2: %s", string(data), string(sb.Bytes()))
}

func test2() {
    data := []byte("Hello, World!")

    //encode
    var buf bytes.Buffer
    sw := snappy.NewBufferedWriter(&buf)
    _, err := sw.Write(data)
    if err != nil {
        println("encode error")
        return
    }

    //decode
    byteSlice := buf.Bytes()
    reqBuf, err := snappy.Decode(nil, byteSlice)
    if err != nil {
        println("test2 decode error:", err.Error())
        return
    }

    fmt.Printf("test1 string1: %s, 2: %s", string(data), string(reqBuf))

}

What did you see instead?

maybe should be correctly code/decode by snappy

What version did you use?

main branch

What config did you use?

code: https://github.com/open-telemetry/opentelemetry-collector/blob/725e8699bf43c71e1209dccbb444e480bee7bda7/config/confighttp/compression.go#L61

Environment

go 1.22

Additional context

In the case of Prometheus and OpenTelemetry collection, Prometheus uses encode and decode, while OpenTelemetry uses NewReader and NewBufferedWriter. However, I believe that whether you decode/encode in stream or read all and then encode/decode, the same protocol should yield the same results, and there should be no parsing failures.

code: https://github.com/prometheus/prometheus/blob/c173cd57c921f582586fc725ad51124728757533/cmd/promtool/metrics.go#L104

I would like to know how to address this bug. Should we consider skipping Snappy? If any fixes are required, I am willing to help resolve this issue.

BinaryFissionGames commented 1 month ago

This mismatch between the reader/writer and encode/decode is called out in the library: https://pkg.go.dev/github.com/golang/snappy#pkg-overview - it explains there is a "stream" format and a "block" format.

Digging a little deeper, it looks like the stream/framed format should be specified as x-snappy-framed, as described here: https://github.com/google/snappy/blob/main/framing_format.txt - I'd expect snappy then to be exclusively for the block/non-framed format, like Prometheus remote write uses.

jpkrohling commented 1 month ago

Thank you for digging into this, @BinaryFissionGames! I just double-checked, and we are using readers/writers consistently in the Collector code, I believe an OTel Collector as client can compress its data using snappy and a Collector server can read it. I'm closing this issue, but if I'm missing something, let me know and I'll reopen.

reader: https://github.com/open-telemetry/opentelemetry-collector/blob/6fcebdbf64a8c7eb1c2f17501e93a04a8c19b540/config/confighttp/compression.go#L61-L72

writer: https://github.com/open-telemetry/opentelemetry-collector/blob/6fcebdbf64a8c7eb1c2f17501e93a04a8c19b540/config/confighttp/compressor.go#L29-L30

tomatopunk commented 1 month ago

I believe this issue should be reopened.

Actually, we are discussing the need for consistency between streaming and framed encryption/decryption.

For example, stream encryption requires stream decryption, and framed encryption requires framed decryption. However, we are unsure about the correct methods to use for encryption and decryption.

origin:

Package snappy implements the Snappy compression format. It aims for very high speeds and reasonable compression.

There are actually two Snappy formats: block and stream. They are related, but different: trying to decompress block-compressed data as a Snappy stream will fail, and vice versa. The block format is the Decode and Encode functions and the stream format is the Reader and Writer types.

The block format, the more common case, is used when the complete size (the number of bytes) of the original data is known upfront, at the time compression starts. The stream format, also known as the framing format, is for when that isn't always true.

The canonical, C++ implementation is at https://github.com/google/snappy and it only implements the block format.
tomatopunk commented 1 month ago

In the post, I mentioned to use of reader/write and encode/decode. Specifically, if we use newBufferWrited to encry and use decode to decry, we will get an error snappy is incorrent

tomatopunk commented 1 month ago

As I mentioned in this post, a classic example is the otlp-collector and prometheus remote write. Prometheus remote write use snappy.Encode for compress, while the otlp-collector use snappy.NewReader. As a result, I encountered a snappy error, despite both using snappy.

tomatopunk commented 1 month ago

so, I suggest either removing snappy decompress in the otlp-collector or replacing the current model with block format.

jpkrohling commented 1 month ago

Looking at the documentation for that package, we are using what should be called "x-snappy-framed".

I'm not sure it's that easy for us to change from framed to block right now, as it would mean that clients and servers are not compatible anymore. I'd rather do a phased approach, where, once every couple of versions we take one of the following steps:

  1. add "x-snappy-framed" working alongside "snappy" (v0.107.0)
  2. deprecate "snappy" (v0.109.0)
  3. remove "snappy" (framed) (v0.111.0)
  4. add "snappy" (blocked) (v0.113.0)

I believe the risk in breaking clients/servers this way is reduced, and we can then have another snappy support for block format, required by Prometheus remote write. Is it possible to implement framed format in Prometheus remote write? If so, we'd have it working on the next version already, due in a couple of weeks.

tomatopunk commented 1 month ago

Perhaps we could allow the receive component to choose whether to automatically decompress.