redpanda-data / connect

Fancy stream processing made operationally mundane
https://docs.redpanda.com/redpanda-connect/about/

Receive UDP packets #875

Open helgeolav opened 3 years ago

helgeolav commented 3 years ago

Hi

I have an issue receiving UDP packets. The problem seems to be with the codec, which requires a newline or some other delimiter before it creates a message. I just need packet boundaries (one packet = one message), and 'codec: all-bytes' does not give me that either. I am not sure how to proceed. Any ideas?

If that is OK, I could make a PR and try to rewrite the code to fix this. I can also look into adding metadata, as mentioned in #327, at the same time.

My config is below. As a UDP sender I can use nc -w 0 -u 127.0.0.1 5000 < myfile.


```
input:
  socket_server:
    network: udp
    address: 0.0.0.0:5000

pipeline:
  processors: []

output:
  type: stdout
```

Jeffail commented 3 years ago

hey @helgeolav, sure, ideally I'd like to maintain the abstraction that the codec type provides, but if it's not possible then we could potentially override behaviour within the socket_server input when the codec is set to something like packets.

helgeolav commented 3 years ago

I have been looking through the code. To receive UDP packets the way I want, I can use codec: chunker:2000. With this, each packet of up to 2000 bytes is passed through as one message. Packets over 2000 bytes are truncated to 2000 bytes and the remainder is discarded.
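The truncation described above matches how a UDP socket behaves when each read uses a fixed-size buffer. The following is a minimal Go sketch of that behaviour, not the Benthos chunker code itself: `readPackets` and `sendAndReceive` are hypothetical helper names, and the loopback setup exists only to make the example self-contained. It assumes a Linux-like network stack, where a datagram larger than the buffer is silently cut to the buffer size (other platforms may report an error instead).

```go
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

// readPackets (hypothetical helper) reads count datagrams from conn, issuing
// one ReadFrom call per datagram into a buffer of maxSize bytes.
// Each returned slice corresponds to exactly one received packet.
func readPackets(conn net.PacketConn, maxSize, count int) ([][]byte, error) {
	msgs := make([][]byte, 0, count)
	buf := make([]byte, maxSize)
	for i := 0; i < count; i++ {
		n, _, err := conn.ReadFrom(buf)
		if err != nil {
			return msgs, err
		}
		msg := make([]byte, n)
		copy(msg, buf[:n])
		msgs = append(msgs, msg)
	}
	return msgs, nil
}

// sendAndReceive (hypothetical helper) opens a loopback listener, sends each
// payload as its own datagram, and returns what readPackets received.
func sendAndReceive(maxSize int, payloads ...[]byte) ([][]byte, error) {
	server, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	defer server.Close()
	// Avoid blocking forever if a datagram goes missing.
	_ = server.SetReadDeadline(time.Now().Add(2 * time.Second))

	client, err := net.Dial("udp", server.LocalAddr().String())
	if err != nil {
		return nil, err
	}
	defer client.Close()

	for _, p := range payloads {
		if _, err := client.Write(p); err != nil {
			return nil, err
		}
	}
	return readPackets(server, maxSize, len(payloads))
}

func main() {
	// One small packet and one packet larger than the 2000-byte buffer.
	msgs, err := sendAndReceive(2000, []byte("hello"), make([]byte, 3000))
	if err != nil {
		log.Fatal(err)
	}
	for i, m := range msgs {
		fmt.Printf("message %d: %d bytes\n", i, len(m))
	}
}
```

Under the stated assumption, the small packet arrives intact and the 3000-byte packet comes back as 2000 bytes, so the buffer size has to be at least as large as the biggest packet you expect.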

For the other part, getting the source IP and port of a packet is a bit harder, as the codec hides the concrete type behind the interface. I need to use ReadFrom instead of Read to also get the Addr out. Would it be acceptable to rewrite some of the codecs to do a type assertion on init? I think it is only needed for chunker, since for UDP it does not make sense to use any other codec directly.

WoLfulus commented 1 year ago

Just stumbled on this one and I'm not sure how to proceed safely, since I expected each packet to be a full message.

The way I got it to "work" is to send a newline at the end, but when there are multiple publishers and one of them sends a broken message (without \n), it will mess up the next packet, since it seems to buffer all packets. It only works for me because I'm sending small JSON (< MTU in size). With a binary protocol it would be completely broken, because \n could appear anywhere.
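The buffering problem can be reproduced outside Benthos with a newline-splitting reader on top of a UDP socket. The sketch below assumes the line-based codec behaves roughly like a `bufio.Scanner` wrapped around the connection, which is an assumption and not a statement about the actual implementation; `firstLine` is a hypothetical helper name.

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"net"
	"time"
)

// firstLine (hypothetical helper) sends each string as its own datagram, then
// reads the socket through a newline-splitting scanner and returns the first
// "message" the scanner produces.
func firstLine(packets ...string) (string, error) {
	server, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)})
	if err != nil {
		return "", err
	}
	defer server.Close()
	_ = server.SetReadDeadline(time.Now().Add(2 * time.Second))

	client, err := net.Dial("udp", server.LocalAddr().String())
	if err != nil {
		return "", err
	}
	defer client.Close()
	for _, p := range packets {
		if _, err := client.Write([]byte(p)); err != nil {
			return "", err
		}
	}

	// The scanner keeps reading datagrams until it finds a newline,
	// so packet boundaries are ignored.
	scanner := bufio.NewScanner(server)
	if !scanner.Scan() {
		if err := scanner.Err(); err != nil {
			return "", err
		}
		return "", fmt.Errorf("no line received")
	}
	return scanner.Text(), nil
}

func main() {
	// The first publisher forgets the trailing newline, the second does not.
	line, err := firstLine(`{"a":1}`, "{\"b\":2}\n")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("first message: %q\n", line)
}
```

On a loopback interface that delivers the two datagrams in order, the first message should contain both JSON objects fused together, which is the corruption described above.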