Open hazelnut-99 opened 4 months ago
Agreed, this would be a great feature. It's a bit tricky because the schema is needed for planning, so this would make SQL planning depend on the schema registry. The schema might also change, which means that a query that plans today might fail tomorrow. There also wouldn't be any feedback to the user about what the schema is. I think these issues are surmountable, but they will require some design work.
When creating a new source connection through the web UI and selecting Avro as the data format with Confluent Schema Registry as the schema type, users can omit the schema, because it is loaded automatically from the Confluent Schema Registry.
However, when a source is defined using DDL within a pipeline, the schema must currently be defined explicitly. For instance, the following DDL statement:
leads to an error when subsequently trying to query the table:
Error:
Schema error: No field named my_field.
It would be nice if ad-hoc DDL inside a pipeline definition could support automatic schema retrieval from the Confluent Schema Registry, similar to what the web UI already does.
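To make the request concrete, here is a rough sketch of what automatic retrieval might involve: look up the latest schema for a subject through the registry's REST endpoint (`GET /subjects/{subject}/versions/latest`), then derive column definitions from the Avro record. This is not how the project implements it. The function names, the `AVRO_TO_SQL` type mapping, and the `events-value` subject in the usage note are all assumptions made for illustration, and only flat records with primitive or nullable-primitive fields are handled.

```python
import json
from urllib.parse import quote

# Hypothetical mapping from Avro primitive types to SQL column types.
# The engine's real mapping may differ.
AVRO_TO_SQL = {
    "string": "TEXT",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
    "bytes": "BYTEA",
}


def latest_schema_url(registry_url, subject):
    """Build the Schema Registry URL for the latest version of a subject."""
    base = registry_url.rstrip("/")
    return f"{base}/subjects/{quote(subject, safe='')}/versions/latest"


def columns_from_registry_response(body):
    """Turn a registry response body into (name, sql_type, nullable) tuples.

    The registry returns a JSON object whose "schema" field is itself a
    JSON-encoded string containing the Avro schema.
    """
    avro = json.loads(json.loads(body)["schema"])
    if not isinstance(avro, dict) or avro.get("type") != "record":
        raise ValueError("only top-level Avro records are supported in this sketch")

    columns = []
    for field in avro["fields"]:
        avro_type = field["type"]
        nullable = False
        if isinstance(avro_type, list):
            # A union such as ["null", "long"] is treated as a nullable column.
            non_null = [t for t in avro_type if t != "null"]
            nullable = len(non_null) < len(avro_type)
            if len(non_null) != 1:
                raise ValueError(f"unsupported union for field {field['name']!r}")
            avro_type = non_null[0]
        if not isinstance(avro_type, str) or avro_type not in AVRO_TO_SQL:
            raise ValueError(f"unsupported type for field {field['name']!r}")
        columns.append((field["name"], AVRO_TO_SQL[avro_type], nullable))
    return columns
```

With the default subject naming strategy, the value schema for a topic `events` would live under the subject `events-value`, so the planner would fetch `latest_schema_url(registry, "events-value")` and pass the response body to `columns_from_registry_response`. Pinning the returned schema version at planning time would be one way to address the concern that a query might plan today but fail tomorrow.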