Closed: michael-huxtable closed this issue 3 years ago
I just want to share my insight, as I also hit this issue recently, and it was a real hell to figure out what was wrong because 99% of messages were consumed/deserialized correctly in my case. It all started with a StackOverflowException in the consumer, too.
It seems there's a limitation on the length of text in string fields in Apache.Avro 1.10.0 (and 1.10.1 as well) that causes `Avro.AvroException: End of stream reached in field XY`.
You can find a code snippet to reproduce this issue below. Once I downgraded Apache.Avro to 1.9.2, the test below passed like a charm.
It looks like a bug to me, but if anyone thinks it's something else (configuration issue or so), please speak up. It will definitely help at least me/my team and @michael-huxtable 😄
```csharp
[Test]
public void should_serialize_long_text()
{
    // config is a SchemaRegistryConfig pointing at our schema registry
    ISchemaRegistryClient schemaRegistry = new CachedSchemaRegistryClient(config);
    IDeserializer<Topic> avroDeserializer = new AvroDeserializer<Topic>(schemaRegistry).AsSyncOverAsync();

    var objectToSerialize = new Topic
    {
        InfoText = "with more than 256 characters passed here the deserialization should not fail using version 1.10. with more than 256 characters passed here the deserialization should not fail using version 1.10. with more than 256 characters passed here the deserialization should not fail using version 1.10."
    };

    var avroBytes = new AvroSerializer<Topic>(schemaRegistry)
        .SerializeAsync(objectToSerialize, SerializationContext.Empty)
        .Result;
    var result = avroDeserializer.Deserialize(avroBytes, isNull: false, SerializationContext.Empty);

    Assert.That(result, Is.Not.Null);
}
```
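For background on why only long strings might trigger this (my reading of the Avro spec, not a dissection of the library's actual bug): Avro's binary encoding prefixes every string with its byte length as a zig-zag varint, so any string longer than 63 bytes gets a multi-byte length prefix, and a buffering bug around that prefix or the bytes that follow would only surface for large strings. A minimal sketch of the length encoding, in Python for brevity:

```python
def zigzag_varint(n: int) -> bytes:
    """Encode an integer the way Avro encodes string lengths:
    zig-zag map it to an unsigned value, then emit 7 bits per byte,
    using the high bit of each byte as a continuation flag."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: a non-negative n maps to 2*n
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

# Lengths 0..63 fit in a single prefix byte; longer strings need more.
assert len(zigzag_varint(63)) == 1
assert len(zigzag_varint(300)) == 2
```

So a short string carries a one-byte length prefix, while the multi-hundred-character string in the repro above needs two, which is at least consistent with only large payloads hitting the broken code path.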
I have hit the same issue; likewise, downgrading to Confluent.SchemaRegistry.Serdes.Avro 1.5.2, which uses Avro 1.9.2, fixed the problem.
The Avro release notes are no help.
Did you discover anything else @marek-vrana ?
Currently there are two open bugs related to our investigation:
Let's hope for a soonish fix for the first one at least 🤞
FYI: version 1.10.2 is out 🎉 large strings issue is fixed. https://github.com/apache/avro/releases/tag/release-1.10.2
Just created https://github.com/confluentinc/confluent-kafka-dotnet/pull/1566 to hopefully speed this up. :)
Think this one can be resolved now that 1.6.3 has been released!
For anyone who hits a similar issue, the above helped me.
We upgraded the Avro serdes NuGet package from 1.5.2 to 1.5.3, and then on consumption we started getting the following errors:

```
Confluent.Kafka.ConsumeException: Local: Value deserialization error
 ---> Avro.AvroException: End of stream reached in field FieldName
```

The fix was to downgrade back to 1.5.2.
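If anyone else needs to hold the line until a fixed release is available, pinning the package explicitly in the project file is enough. A hypothetical `.csproj` fragment (adjust the package name and version to whichever combination works in your setup):

```xml
<!-- Pin the serdes package to 1.5.2, which still depends on Apache.Avro 1.9.2 -->
<ItemGroup>
  <PackageReference Include="Confluent.SchemaRegistry.Serdes.Avro" Version="1.5.2" />
</ItemGroup>
```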
Description
When upgrading to Confluent Kafka 1.5.0 from 1.3.0, for our separate Avro message projects we upgraded those projects to use `Apache.Avro` version 1.10.0, from `Confluent.Kafka.Avro` -> `1.7.7.7`. At the time I noted that the minimum version of the dependency should be `1.9.2`, but went with the latest here. This has been fine so far, but for this specific message I noticed StackOverflow exceptions in the consumer.

For a specific Avro type we have:
One thing to note is that the strings in the fields will be quite large. I can try to get exact sizes if this helps.
When specifically targeting Apache.Avro 1.10.0 the following happens:
How to reproduce
Should we use the latest Apache.Avro version? I have a lot of more complex Avro messages targeting this version that have worked fine thus far. The only thing I can think of is the large strings?
Checklist
Please provide the following information: