redpanda-data / connect

Fancy stream processing made operationally mundane
https://docs.redpanda.com/redpanda-connect/about/
8.09k stars 818 forks source link

Add support for `EmitDefaultValues` in `protobuf` processor #2884

Closed szymonorz closed 5 days ago

szymonorz commented 6 days ago

As of now protobuf processor doesn't have an ability to emit default values from protobuf messages.

Meaning, given schema:

syntax = "proto3";
package prototest;
option go_package = "github.com/szymonorz/genserver";

message Event {
    string id = 1;
    int64 timestamp = 2;
    string name = 3;
    bool approved = 4;
}

if approved was set to false (bool default value) then after converting the event to JSON, that field would be omitted from message.

pipeline:
  processors:
    - protobuf:
        operator: to_json
        message: prototest.Event
        import_paths: [./schema]
{"id":"aaaaa", "timestamp":"1726777881", "name":"Robert"}
{"id":"aaaaa", "timestamp":"1726777881", "name":"Robert", "approved":true}
{"id":"aaaaa", "timestamp":"1726777881", "name":"Robert"}
{"id":"aaaaa", "timestamp":"1726777881", "name":"Robert", "approved":true}
{"id":"aaaaa", "timestamp":"1726777881", "name":"Robert"}
{"id":"aaaaa", "timestamp":"1726777881", "name":"Robert", "approved":true}
{"id":"aaaaa", "timestamp":"1726777881", "name":"Robert"}

google.golang.com/protobuf supports EmitDefaultValues in protojson.MarshallOptions

I think it would be a nice feature to have ability to enable it in Connect. I propose adding emit_default_values field to the processor. It should be disabled by default to not affect already existing pipelines

pipeline:
  processors:
    - protobuf:
        operator: to_json
        message: prototest.Event
        import_paths: [./schema]
        emit_default_values : true
{"id":"aaaaa", "timestamp":"1726777564", "name":"Robert", "approved":false}
{"id":"aaaaa", "timestamp":"1726777564", "name":"Robert", "approved":true}
{"id":"aaaaa", "timestamp":"1726777564", "name":"Robert", "approved":false}
{"id":"aaaaa", "timestamp":"1726777564", "name":"Robert", "approved":false}
{"id":"aaaaa", "timestamp":"1726777564", "name":"Robert", "approved":true}

I'm willing to implement this if you are ok with having this feature

mihaitodor commented 5 days ago

Hey @szymonorz 👋 Thank you for raising this issue! Could you please check if it is a duplicate of #2506? I think there's a PR opened for it here #2507, but I'm not very familiar with protobuf and it would be good to get an extra pair of eyes on that code.

szymonorz commented 5 days ago

Thanks for your reply. Yes, this is a duplicate. I didn't notice that issue and PR beforehand, sorry.

I tested out the PR on my use-case and it does solve the problem and from the looks of it I don't see anything wrong with the code. I'm gonna close this issue due to it being a duplicate