Open vinicius0197 opened 7 months ago
This looks to me like the coder in use is not being updated to reflect the new field. That would make sense, as I don't believe Beam supports changing coders dynamically.
I'm a bit surprised this has worked cleanly in the past. Can you provide an example of a change that has worked?
What happened?
We have a Dataflow pipeline running Apache Beam 2.48.0. The pipeline consumes Avro data from a Kafka topic using KafkaIO with schema-registry. Yesterday we updated the schema for this topic (added a new field) without issues. Today (some 12 hours after the change) we started receiving errors in our Dataflow pipeline. The stack trace is below:
Stacktrace
Looking at the line

org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: mytable

what I find interesting is that the schema in the schema registry looks like this (I'm showing just the last few lines of the schema):

The tags field is the new field that was added to the schema registry yesterday. It looks like the AvroCoder tried to encode the __table field, which is of type string (value mytable), as the type of the next field in the schema (which is a long).

This is not the first time we've added a new field to a schema in schema-registry, and we didn't have issues before.
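To make the suspected mechanism concrete, here is a minimal, library-free sketch. It is not Avro or Beam code. It only mimics a writer that walks its own (stale) schema and reads record values by position. The class name, the ts field name, the sample values, and the assumption that tags was inserted before __table are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class PositionalMismatchSketch {

    /** Toy field: a name plus the Java type its nullable slot accepts. */
    static final class Field {
        final String name;
        final Class<?> type;

        Field(String name, Class<?> type) {
            this.name = name;
            this.type = type;
        }
    }

    // Schema the stale coder still holds: no "tags" field yet.
    // "ts" is a hypothetical name standing in for the ["null","long"] field.
    static final List<Field> OLD_SCHEMA = Arrays.asList(
        new Field("__table", String.class),
        new Field("ts", Long.class));

    // Schema after the change, ASSUMING "tags" was inserted before "__table".
    static final List<Field> NEW_SCHEMA = Arrays.asList(
        new Field("tags", String.class),
        new Field("__table", String.class),
        new Field("ts", Long.class));

    /** Walks the coder's schema and reads record values by position, not by name. */
    static void encode(List<Field> coderSchema, List<Object> recordValues) {
        for (int pos = 0; pos < coderSchema.size(); pos++) {
            Field field = coderSchema.get(pos);
            Object value = recordValues.get(pos);
            // null is always accepted, like the "null" branch of a nullable union
            if (value != null && !field.type.isInstance(value)) {
                throw new IllegalStateException(
                    "Value does not fit field '" + field.name + "': " + value);
            }
        }
    }

    public static void main(String[] args) {
        // Values laid out in NEW_SCHEMA order: tags, __table, ts.
        List<Object> record = Arrays.<Object>asList("a,b", "mytable", 1700000000L);

        encode(NEW_SCHEMA, record); // matching schema: no error

        try {
            encode(OLD_SCHEMA, record); // stale schema: "mytable" lands in the long slot
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this toy, the tags value slips silently into the __table slot because both are strings, and the failure only surfaces one position later, when mytable is checked against the long slot. Whether that is what actually happens in the pipeline depends on where the new field sits in the real schema and on which schema the coder was built with.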
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components