ArroyoSystems / arroyo

Distributed stream processing engine in Rust
https://arroyo.dev
Apache License 2.0

Support for avro union types #660

Open kvedes opened 5 months ago

kvedes commented 5 months ago

Hi

I'm trying to use Arroyo on a Kafka topic from Confluent. I can successfully set up the connection and fetch the Avro schema stored in the Schema Registry. However, when I try to create a pipeline, my source shows as JSON instead of Avro. I suspect this is because my Avro schema contains union types. I cannot find any documentation on union types, so I suppose they might not be supported. Will this be coming with the Protobuf support? Protobuf has similar semantics, just called oneof. Otherwise, I would like to request it.

P.S. I'm brand new to Arroyo
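
For readers unfamiliar with the term, here is a minimal sketch of what a union looks like in an Avro schema. The schema and the helper names (`classify_field`, `classify_schema`) are hypothetical and made up for illustration; they are not taken from the reporter's topic or from Arroyo's code. The sketch relies only on the Avro convention that a union is written as a JSON array of types, and it separates the common optional-value pattern `["null", T]` from a general union with several non-null branches. Whether Arroyo draws the line in exactly this place is an assumption here.

```python
import json

# Hypothetical schema for illustration only.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "note", "type": ["null", "string"]},
    {"name": "payload", "type": ["string", "long", "double"]}
  ]
}
"""


def classify_field(field_type):
    """Return 'plain', 'nullable', or 'general_union' for an Avro field type."""
    # In an Avro schema, a union is written as a JSON array of types.
    if not isinstance(field_type, list):
        return "plain"
    non_null = [t for t in field_type if t != "null"]
    # A two-branch union with exactly one non-null branch is the
    # optional-value pattern, regardless of branch order.
    if len(field_type) == 2 and len(non_null) == 1:
        return "nullable"
    return "general_union"


def classify_schema(schema):
    """Map each top-level field name to its classification."""
    return {f["name"]: classify_field(f["type"]) for f in schema["fields"]}


result = classify_schema(json.loads(SCHEMA_JSON))
print(result)
```

In this made-up schema, `payload` is the kind of field the request is about: a value that may be any one of several types, much like a Protobuf oneof.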

timonviola commented 5 months ago

Hey, did you check this part of the docs: https://doc.arroyo.dev/connectors/confluent#avro

Just to check whether you are seeing the expected behavior (quoting the docs):

For Avro, there are some features that cannot be converted to SQL types:

Is this what you mean by "source shows as JSON"?

kvedes commented 5 months ago

Hi @timonviola

Thanks, yes, this is exactly what I'm seeing. I was hoping to see support for general union types, but it doesn't sound like that is on the roadmap.

kvedes commented 1 month ago

@mwylde I just saw that Arroyo 0.12 has full support for Protobuf. Does this include the Protobuf "oneof" type? If so, supporting Avro union types should require a very similar implementation.