basimons opened 11 months ago
I think this is not supported, at least with Jackson's native Avro read implementation. The Apache Avro lib-backed variant, while slower, might handle default values correctly.
As to how to enable the Apache Avro lib backend, I think there are unit tests that do that.
I agree, it'd be good to document this gap.
Thanks for your response.
I tried looking for a unit test, but I couldn't find one. I did, however, find ApacheAvroParserImpl. When I implemented it like this:
try (AvroParser parser = new ApacheAvroFactory(new AvroMapper()).createParser(payload)) {
    parser.setSchema(schema);
    TreeNode treeNode = parser.readValueAsTree();
    System.out.println(treeNode);
}
Unfortunately it does not work (as in, no default values are filled in). Am I doing it correctly, or should I also use a different codec?
I made some changes, as the code that I showed in my first message of course does not fully make sense: you cannot omit writing a value, even if it has a default. So I changed it to this:
String writingSchema = """
        {
          "type": "record",
          "name": "Employee",
          "fields": [
            {"name": "age", "type": "int"},
            {"name": "emails", "type": {"type": "array", "items": "string"}},
            {"name": "boss", "type": ["Employee", "null"]}
          ]
        }
        """;
String readingSchema = """
        {
          "type": "record",
          "name": "Employee",
          "fields": [
            {"name": "name", "type": ["string", "null"], "default": "bram"},
            {"name": "age", "type": "int"},
            {"name": "emails", "type": {"type": "array", "items": "string"}},
            {"name": "boss", "type": ["Employee", "null"]}
          ]
        }
        """;
String employeeJson = """
        {
          "age": 26,
          "emails": ["test@test.com", "test@test.com"],
          "boss": {
            "age": 33,
            "emails": ["test@test.blockbax.com"]
          }
        }
        """;
When I do this and read the values back, I get the following exception:
java.io.IOException: Invalid Union index (26); union only has 2 types
which is the same as reported in https://github.com/FasterXML/jackson-dataformats-binary/issues/164. Presumably the decoder, knowing only the reader schema, reads the first value in the payload (age = 26) as the union index of the added "name" field.
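For anyone trying to reproduce this, the round trip described above would look roughly like the following. This is a sketch, not the reporter's exact code; the mapper setup and variable names are my assumption:

AvroMapper mapper = new AvroMapper();
AvroSchema writeSchema = mapper.schemaFrom(writingSchema);
AvroSchema readSchema = mapper.schemaFrom(readingSchema);

// Encode the JSON document as Avro binary using the writer schema
byte[] payload = mapper.writer(writeSchema)
        .writeValueAsBytes(new ObjectMapper().readTree(employeeJson));

// Decoding with only the reader schema is what triggers the
// "Invalid Union index (26)" error described above
JsonNode result = mapper.readerFor(JsonNode.class)
        .with(readSchema)
        .readValue(payload);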
The only other note I have is that this:
new ApacheAvroFactory(new AvroMapper())
is the wrong way around: it should be
new AvroMapper(new ApacheAvroFactory())
to have correct linking; and then you should be able to create an ObjectReader / ObjectWriter through which you can assign the schema.
But I suspect that won't change things too much: either way you have an ApacheAvroFactory that is using the Apache Avro lib.
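In code, the corrected wiring would be something like this (a sketch based on the earlier snippet; schema and payload are the variables from above):

AvroMapper mapper = new AvroMapper(new ApacheAvroFactory());
// ObjectReader with the schema assigned, as suggested above
JsonNode result = mapper.readerFor(JsonNode.class)
        .with(schema)
        .readValue(payload);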
Ah thanks, I didn't know that. I tried it, but as you said, it did indeed not work.
What's weird: I even tried decoding it with the Apache Avro library myself, using GenericDatumReader (and all the things that come with it), but I got exactly the same error. This does not make sense, right? I'm sure that what I'm doing is allowed by Avro (adding a field with a default in a reader schema that is not in the writer schema), as I have done it many times in my Kafka cluster.
Do you happen to know what the difference might be? Do my Kafka clients do anything special for this?
I finally get it. In my Kafka cluster, the writer schema is saved along with the data. If you parse it like this:
Schema avroSchema = ((AvroSchema) schema).getAvroSchema();
// writerSchema: the org.apache.avro.Schema the payload was written with
GenericDatumReader<GenericRecord> datumReader = new GenericDatumReader<>(writerSchema, avroSchema);
BinaryDecoder binaryDecoder = DecoderFactory.get().binaryDecoder(payload, null);
GenericRecord read = datumReader.read(null, binaryDecoder);
So, with the specific writer schema, it does work. Normally Kafka does this for you, but I don't think the AvroMapper has a way to do it with 2 schemas.
@basimons The Avro module does indeed allow use of a 2-schema (read/write) configuration -- it's been a while, so I'll have to see how it was done. I think AvroMapper has methods to construct a Jackson AvroSchema from 2 separate schemas.
Ah. Close: AvroSchema has the method withReaderSchema(AvroSchema rs). You get both schema instances, then call the method on the "writer schema" (the one used for writing records). From ArrayEvolutionTest:
final AvroSchema srcSchema = MAPPER.schemaFrom(SCHEMA_XY_ARRAY_JSON);
final AvroSchema dstSchema = MAPPER.schemaFrom(SCHEMA_XYZ_ARRAY_JSON);
final AvroSchema xlate = srcSchema.withReaderSchema(dstSchema);
and then you construct an ObjectReader as usual.
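Applied to the schemas from earlier in this issue, that would look something like the following (a sketch, untested; payload is assumed to be the Avro binary written with writingSchema):

AvroMapper mapper = new AvroMapper();
AvroSchema writeSchema = mapper.schemaFrom(writingSchema);
AvroSchema readSchema = mapper.schemaFrom(readingSchema);

// Call withReaderSchema() on the writer schema; fields present only in
// the reader schema (here "name") get their default values filled in
AvroSchema xlate = writeSchema.withReaderSchema(readSchema);

JsonNode result = mapper.readerFor(JsonNode.class)
        .with(xlate)
        .readValue(payload);
System.out.println(result); // "name" should now default to "bram"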
Hello,
I encountered something strange while doing some tests with Avro decoding.
The example here was run with version 2.16.0:
If you look at this object, you see that the default value is not filled in; it is just null, while all the other fields are filled exactly as expected. I also tried this with different schemas that do not have a union with null, just the default, but that would result in a JsonMappingException.
Am I doing something wrong here, or is this not supported? The documentation doesn't say that default values are unsupported, the way it does for the Protobuf module.
Thanks in advance
EDIT: It makes sense that this does not work, as you cannot write an Avro file that omits a value for a field with a default. I think it should've thrown an error on writing. But the main question is why it doesn't work with a reader schema that has a default and a writer schema that doesn't have one. See my other question.