redpanda-data / connect

Fancy stream processing made operationally mundane
https://docs.redpanda.com/redpanda-connect/about/

Cannot decompress HTTP POST gzipped JSON #2427

Open webfrank opened 8 months ago

webfrank commented 8 months ago

Hi, I was trying to send a gzipped JSON payload to an HTTP server endpoint.

input:
  http_server: 
    path: /{tenant}
    timeout: 30s
    sync_response:
      status: "200"

pipeline:
  processors:
  - switch:
    - check: meta("Content-Encoding") == "gzip"
      processors:
      - decompress:
          algorithm: gzip

I'm sending a simple JSON file compressed with: gzip sample.json

I've tried all the other algorithms with no success. This is the error:

ERRO Failed to decompress message part: unexpected EOF @service=benthos label="" path=root.pipeline.processors.0.switch.0.processors.0

The POST is sent from the VS Code REST plugin:

POST http://localhost:4195/demo
Content-Type: application/json
Content-Encoding: gzip

< ./sample.json.gz

Jeffail commented 8 months ago

@webfrank when the content encoding is set to gzip the data is decompressed automatically so you shouldn't need that processor.

webfrank commented 8 months ago

Hi, I already tried that. This is the result:

INFO Running main config from specified file       @service=benthos benthos_version=4.24.0 path=realtime.yaml
INFO Listening for HTTP requests at: http://0.0.0.0:4195  @service=benthos
INFO Launching a benthos instance, use CTRL+C to close  @service=benthos
[raw, still-compressed gzip bytes printed to stdout; the embedded file name sample.json is visible in the stream]

The message is not decompressed automatically.

Jeffail commented 8 months ago

Are you sure you're receiving valid gzipped content?

webfrank commented 8 months ago

Yes, if I save the received stream I can decompress it with gzip -d

mihaitodor commented 7 months ago

@webfrank I was able to reproduce it using the vscode-restclient extension (https://github.com/Huachao/vscode-restclient), which I assume is the one you used. If you get Benthos to dump the payload to a file, you'll notice that the HTTP request payload contains an extra newline at the end. I think this is a bug in the extension. It's also possible that the Golang HTTP libraries don't handle this corner case correctly, but I'm inclined to think it's the former. Try it with curl instead and it should work as expected. I'd consider reporting it as a bug on the extension's GitHub repo.
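
For anyone who wants to check a dumped payload for the trailing newline described above, here is a minimal Python sketch. It is not part of Benthos. The helper name `split_gzip_payload` and the sample JSON are made up for illustration, and the "dirty" payload is only an assumed reproduction of what the extension sends.

```python
import gzip
import zlib


def split_gzip_payload(payload: bytes) -> tuple[bytes, bytes]:
    """Decompress the first gzip member and return (data, trailing_bytes)."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(zlib.MAX_WBITS | 16)
    data = decoder.decompress(payload)
    # Anything after the end of the gzip member is left in unused_data.
    return data, decoder.unused_data


original = b'{"hello": "world"}'  # stand-in for sample.json
clean = gzip.compress(original)
# Assumed reproduction of the symptom: one extra newline after the stream.
dirty = clean + b"\n"

for name, payload in (("clean", clean), ("dirty", dirty)):
    data, trailing = split_gzip_payload(payload)
    print(name, data, "trailing bytes:", trailing)
```

Running the same helper on the file Benthos dumped should show whether anything follows the gzip stream. A decoder that treats trailing bytes as the start of another gzip member would be expected to fail on the "dirty" case.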

webfrank commented 7 months ago

Hi, with curl it works, but Benthos doesn't decompress the payload automatically. I still need to call "decompress".
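
As an alternative to curl, the same request can be built with the Python standard library. This is a sketch only: `build_gzip_post` is a hypothetical helper, the JSON body is a placeholder, and the URL is the one from the original report.

```python
import gzip
import urllib.request


def build_gzip_post(url: str, json_bytes: bytes) -> urllib.request.Request:
    """Build a POST request whose body is exactly the gzip stream, nothing more."""
    body = gzip.compress(json_bytes)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",
        },
    )


request = build_gzip_post("http://localhost:4195/demo", b'{"hello": "world"}')

# To actually send it (requires the http_server input to be listening):
# urllib.request.urlopen(request)
```

Because the body is produced in memory by `gzip.compress`, no extra newline is appended after the stream.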

mihaitodor commented 7 months ago

The input doesn't decompress the payload automatically by design. You'll have to add a decompress processor like you had in your original question. A scanner field could be added to the http_server input as an enhancement, but I'm not sure if it really makes sense.
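
The logic of the switch-plus-decompress step from the original config can be sketched in Python as below. This only illustrates the check-then-decompress idea and is not how Benthos implements it; `maybe_decompress` is a hypothetical name.

```python
import gzip


def maybe_decompress(headers: dict[str, str], body: bytes) -> bytes:
    """Return the gzip-decoded body when Content-Encoding is gzip, else the body as-is."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    encoding = lowered.get("content-encoding", "")
    if encoding.strip().lower() == "gzip":
        return gzip.decompress(body)
    return body


payload = b'{"hello": "world"}'
compressed = gzip.compress(payload)

print(maybe_decompress({"Content-Encoding": "gzip"}, compressed))
print(maybe_decompress({"Content-Type": "application/json"}, payload))
```

Both calls print the same JSON bytes: the first after decompression, the second untouched because no gzip encoding was declared.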