Open hjwalt opened 1 year ago
@hjwalt, it would be great if you could work on it. Let me know if you need any help getting started.
I'll start on it and let you know if I need any help, either here or on the community Slack!
I'm also considering extending this concept to add dynamic serde, but that might need a KLIP. So the idea for this implementation is to create a path towards that.
Plan as of now:
SerdeClassLoader: similar to the UDF class loader, this will scan from KSQL_KSQL_SERDE_DIR, defaulting to /opt/ksqldb-serdes
in ksqldb-serde, in the serde package
protobuf_class format (Translator, Converter, etc.) in ksqldb-serde, in the protobuf package
VALUE_PROTOBUF_CLASS, which is practically just the full class name of the protobuf class
Is there anything else I need to take note of, @suhas-satish?
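The directory-scanning part of the plan above could look roughly like the following. This is only a sketch under stated assumptions: the class name SerdeDirScanner and its methods are hypothetical and are not the actual ksqlDB UDF loader or the proposed SerdeClassLoader. It only shows resolving the directory (with the /opt/ksqldb-serdes default from the plan), collecting jar files, and exposing them through a standard URLClassLoader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical sketch only; not the real ksqlDB class loader. */
final class SerdeDirScanner {
    // Default directory taken from the plan above.
    static final String DEFAULT_SERDE_DIR = "/opt/ksqldb-serdes";

    /** Use the configured value if present, otherwise fall back to the default. */
    static Path resolveDir(String configured) {
        boolean missing = configured == null || configured.isEmpty();
        return Paths.get(missing ? DEFAULT_SERDE_DIR : configured);
    }

    /** List *.jar files directly under dir, sorted; empty if dir is not a directory. */
    static List<Path> listJars(Path dir) {
        if (!Files.isDirectory(dir)) {
            return Collections.emptyList();
        }
        try (Stream<Path> entries = Files.list(dir)) {
            List<Path> jars = new ArrayList<>();
            entries.filter(Files::isRegularFile)
                   .filter(p -> p.getFileName().toString().endsWith(".jar"))
                   .sorted()
                   .forEach(jars::add);
            return jars;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Build a class loader over the given jars, delegating to parent first. */
    static URLClassLoader loaderFor(List<Path> jars, ClassLoader parent) {
        URL[] urls = new URL[jars.size()];
        for (int i = 0; i < jars.size(); i++) {
            try {
                urls[i] = jars.get(i).toUri().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException(e);
            }
        }
        return new URLClassLoader(urls, parent);
    }

    public static void main(String[] args) throws Exception {
        Path dir = resolveDir(System.getenv("KSQL_KSQL_SERDE_DIR"));
        ClassLoader parent = SerdeDirScanner.class.getClassLoader();
        try (URLClassLoader loader = loaderFor(listJars(dir), parent)) {
            System.out.println("Scanned " + dir + ", jars found: " + loader.getURLs().length);
            // A protobuf class named by VALUE_PROTOBUF_CLASS could then be
            // looked up with loader.loadClass(fullClassName).
        }
    }
}
```

With a loader like this, the full class name supplied through VALUE_PROTOBUF_CLASS would be enough to locate the compiled protobuf class, provided its jar sits in the scanned directory.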
Is your feature request related to a problem? Please describe.
We use protobuf bytes for our Kafka record key and payload without the Confluent wire format. With the addition of protobuf_nosr, we can use an implicit mapping from schema to table / stream assuming:
Example:
Proto schema:
Implicitly correct protobuf_nosr stream:
The first assumption is acceptable. The remaining assumption makes correctness depend on an implicit mechanism that we cannot control.
To illustrate, with the following schema and stream combination and a Kafka payload generated by another producer, the stream will silently fail because of field ordering.
The following combination will also silently fail, because of a skipped field:
Describe the solution you'd like
This problem can be resolved by inferring the field index from protobuf descriptor information if we can supply:
This way the protobuf index of id and some_value can be inferred with field descriptor information from the class.
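To make the difference concrete, here is a minimal sketch in plain Java with no protobuf dependency. Everything in it is an assumption for illustration: the message layout (id = 1, some_value = 3, with field 2 skipped), the class FieldIndexSketch, and the name-to-number map, which stands in for what a real field descriptor would provide. The positional mapping models the implicit behaviour described above; it is not taken from the ksqlDB source.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; the maps stand in for protobuf descriptor lookups. */
final class FieldIndexSketch {

    /** Decode a payload made only of varint fields into fieldNumber -> value. */
    static Map<Integer, Long> decodeVarintFields(byte[] payload) {
        Map<Integer, Long> out = new LinkedHashMap<>();
        int[] pos = {0};
        while (pos[0] < payload.length) {
            long tag = readVarint(payload, pos);
            // On the wire, tag = (field_number << 3) | wire_type.
            if ((tag & 0x7) != 0) {
                throw new IllegalArgumentException("sketch handles varint fields only");
            }
            out.put((int) (tag >>> 3), readVarint(payload, pos));
        }
        return out;
    }

    /** Read one base-128 varint starting at pos[0], advancing pos[0]. */
    static long readVarint(byte[] buf, int[] pos) {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = buf[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("malformed varint");
    }

    /** Assumed implicit mapping: the i-th column reads field number i + 1. */
    static Map<String, Long> resolveByPosition(List<String> columns,
                                               Map<Integer, Long> decoded) {
        Map<String, Long> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            row.put(columns.get(i), decoded.get(i + 1));
        }
        return row;
    }

    /** Proposed mapping: each column reads the field number registered for its name. */
    static Map<String, Long> resolveByName(List<String> columns,
                                           Map<String, Integer> fieldNumbers,
                                           Map<Integer, Long> decoded) {
        Map<String, Long> row = new LinkedHashMap<>();
        for (String column : columns) {
            Integer number = fieldNumbers.get(column);
            row.put(column, number == null ? null : decoded.get(number));
        }
        return row;
    }

    public static void main(String[] args) {
        // Hypothetical message: id = 1, some_value = 3 (field 2 is skipped).
        byte[] payload = {0x08, 0x07, 0x18, 0x2A}; // id = 7, some_value = 42
        List<String> columns = List.of("id", "some_value");
        Map<Integer, Long> decoded = decodeVarintFields(payload);

        // prints {id=7, some_value=null}
        System.out.println(resolveByPosition(columns, decoded));
        // prints {id=7, some_value=42}
        System.out.println(resolveByName(columns, Map.of("id", 1, "some_value", 3), decoded));
    }
}
```

In a real implementation the name-to-number map would not be written by hand. With the standard protobuf Java library it could be derived from the class's descriptor, for example via Descriptor.findFieldByName(name).getNumber().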
Describe alternatives you've considered
Use schema registry format (PROTOBUF)
An option, but a separate concern from this feature
Use schema registry url instead of VALUE_PROTOBUF_CLASS with protostuff / protoparser
Schema registry stores the protobuf schema as a text file, not as a compiled proto descriptor. It cannot be used directly by the standard protobuf library to get field descriptor information.
Protoparser: already archived
Protostuff: an option, but still not the standard way to use protobuf in Java
Use other kind of schema registry
An example is stencil. This might work, but I think from the perspective of most people in the community, this is even more non-standard.
Get schema registry to store protobuf descriptors
Maybe, but this is not the right repository for that.
Additional context
I, of course, do not know yet how complex the field mapping and descriptor assignment will be based on the data structure and classpath scanning that is internally used by ksqldb. However, if this sounds like something useful for ksqldb I am happy to work on this.