Closed — rutkowskij closed this issue 4 weeks ago
Oh, never mind, it works with:
public fun <T> Avro.decodeFromGenericData(
    writerSchema: Schema,
    deserializer: DeserializationStrategy<T>,
    value: Any?,
): T
Yes, please read the migration guide, you have it here: https://github.com/avro-kotlin/avro4k/blob/main/Migrating-from-v1.md#generic-data-serialization
If you think something is missing, please open a PR to add this migration example :rocket:
I had to do some digging in both versions to find a solution. Let me know if it's better than what I suggested https://github.com/avro-kotlin/avro4k/pull/247
Oh, I just got it. You want to deserialize generic data from bytes, which is not possible at the moment. What is the purpose of using avro4k then? Are you using avro4k with some Kotlin classes declared with @Serializable? Or are you only encoding/decoding generic data?
I am using a previous version of avro4k and @Serializable. I would like to update avro4k to version 2.0.0, but I have a lot of data saved as AvroEncodeFormat.Binary and need to maintain backward compatibility.
This is just raw, pure Avro serialization. Generic data and specific classes will be serialized exactly the same.
Hello, I haven't heard from you in 2 weeks. Can you explain your needs a bit?
> need to maintain backward compatibility
As said, whatever the runtime type of your data (generic record or data class), if the content and the schema are the same, the binary representation will be exactly the same. Do you need to deserialize generic content, meaning you can't know the schema in advance? Or do you know the content, so you can write a dedicated Kotlin data class? My final thought is that if you only deal with generic records, then avro4k may not be the right solution, as you won't use any Kotlin feature, so I would advise you to use the standard Apache library.
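To illustrate why the runtime type doesn't matter, here is a minimal, self-contained sketch of two of Avro's binary encoding rules from the spec (zig-zag varints for longs, length-prefixed UTF-8 for strings). The names `encodeRecord`, `User`, etc. are illustrative, not part of avro4k or Apache Avro: the point is that the bytes depend only on the schema and the values, whether they come from a data class or a generic map.

```kotlin
// Zig-zag varint encoding of a long, per the Avro spec.
fun encodeLong(value: Long): ByteArray {
    var n = (value shl 1) xor (value shr 63) // zig-zag: small magnitudes -> small codes
    val out = mutableListOf<Byte>()
    while (n and 0x7FL.inv() != 0L) {
        out.add(((n and 0x7FL) or 0x80L).toByte())
        n = n ushr 7
    }
    out.add(n.toByte())
    return out.toByteArray()
}

// Avro string: byte length as a long, then the UTF-8 bytes.
fun encodeString(value: String): ByteArray {
    val utf8 = value.toByteArray(Charsets.UTF_8)
    return encodeLong(utf8.size.toLong()) + utf8
}

// Hypothetical schema: record { id: long, name: string } -- fields written in order.
fun encodeRecord(id: Long, name: String): ByteArray =
    encodeLong(id) + encodeString(name)

data class User(val id: Long, val name: String)

fun main() {
    val fromDataClass = User(3, "ab").let { encodeRecord(it.id, it.name) }
    val fromGenericMap = mapOf("id" to 3L, "name" to "ab")
        .let { encodeRecord(it["id"] as Long, it["name"] as String) }
    println(fromDataClass.contentEquals(fromGenericMap)) // true: same schema + content, same bytes
}
```

Same content, same schema, same bytes — which is why previously written binary data stays readable as long as the writer schema is known.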
I'm currently implementing support for GenericRecord, GenericFixed and GenericEnumSymbol, but it takes time to do it cleanly.
Thank you for your patience, I was on vacation hence the late reply.
I used the previous version of avro4k in one project where both AvroEncodeFormat.Data (@Serializable with an embedded schema) and AvroEncodeFormat.Binary were used as standard; due to the large number of small objects, the schema was saved in a separate file. I was looking for a way to read both formats in parallel after updating to the newest version. There is no problem with Data, and for Binary I found a way, which I added to the migration instructions in PR #247. I can merge it if you think it may be useful for others (and it seems it may be, because previously it was possible to use the Binary format with an externally delivered schema). From my perspective, as I found the solution to the problem, there is no need to do anything more here.
No problem. I think it's a must to allow handling generic data, as we do not always have the corresponding classes. And if working with both generic and specific records requires using both avro4k and the Apache Avro library, then it's better to unify the API for this kind of usage. By the way, in my company we also have these mixed needs.
I'm currently working on that solution, simply using the Any type, which will trigger generic encoding or decoding. There are still the logical types to handle during decoding, and then I'll submit the PR. That's why I haven't merged your PR: there is ongoing work to support this use case.
Hello back, I'm ready to merge and release. But there is still a gap to fill: logical types.
Do you need to decode LocalDateTime, LocalDate, or the like? Or do you just use common ints, strings, maps, etc.?
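For context, the types mentioned above map to Avro's logical types, which annotate an underlying primitive in the schema. A minimal schema fragment (the record and field names here are made up for illustration; the `logicalType` names are from the Avro specification):

```json
{
  "type": "record",
  "name": "Event",
  "fields": [
    { "name": "createdAt",
      "type": { "type": "long", "logicalType": "timestamp-millis" } },
    { "name": "day",
      "type": { "type": "int", "logicalType": "date" } }
  ]
}
```

On the wire these are plain longs and ints; the logical type only tells the decoder how to interpret them (e.g. as an instant or a date).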
After reading your messages many times, I'm really sorry for misunderstanding your initial request: just reading previously written binary Avro with v2... I've commented on your PR #247 accordingly; there is nothing to do on the avro4k side.
// Previously (avro4k v1)
Avro.default.openInputStream(serializer) { decodeFormat = AvroDecodeFormat.Binary(schema) }
    .from(data)
    .use { avroInputStream -> return avroInputStream.nextOrThrow() }

// Now (avro4k v2)
val inputStream = ByteArrayInputStream(data)
while (inputStream.available() > 0) {
    // If the writer schema corresponds to the specified type:
    val element = Avro.decodeFromStream<MyType>(inputStream)
    // If the writer schema does not correspond to the specified type:
    // val element = Avro.decodeFromStream<MyType>(writerSchema, inputStream)
    // With an explicit writer schema and serializer:
    // val element = Avro.decodeFromStream(writerSchema, serializer, inputStream)
}
EDIT: added the example to this ticket in case other people face the same issue.
In the previous version there was AvroEncodeFormat.Binary. Using it, you could store data without a schema, and it was possible to provide an external schema on read. I tried to read previously saved data using AvroSingleObject and the provided schema, but the magic numbers do not match, so it won't work. Have I missed something? I can't find a way to continue supporting previously saved data with the new version of avro4k. Thank you for your great work!
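The "magic numbers do not match" symptom can be sketched without any library. Per the Avro specification, single-object encoding frames each datum with a 2-byte marker (0xC3 0x01) plus an 8-byte schema fingerprint, while the old schema-less Binary format wrote only the field bytes. The function name below is illustrative, not an avro4k API:

```kotlin
// Marker bytes from the Avro spec's single-object encoding.
val SINGLE_OBJECT_MARKER = byteArrayOf(0xC3.toByte(), 0x01)

// A single-object payload must start with the marker, followed by an
// 8-byte schema fingerprint, before any record data.
fun hasSingleObjectHeader(bytes: ByteArray): Boolean =
    bytes.size >= 10 &&
        bytes[0] == SINGLE_OBJECT_MARKER[0] &&
        bytes[1] == SINGLE_OBJECT_MARKER[1]

fun main() {
    // Raw schema-less binary encoding of some record: just field bytes.
    val rawBinary = byteArrayOf(0x06, 0x04, 0x61, 0x62)
    // Single-object framing: marker + 8-byte fingerprint + the same body.
    val singleObject = SINGLE_OBJECT_MARKER + ByteArray(8) + rawBinary
    println(hasSingleObjectHeader(rawBinary))    // false: no header, so a single-object read fails
    println(hasSingleObjectHeader(singleObject)) // true
}
```

This is why a single-object reader rejects data written with the old Binary format: the header it looks for was never written.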
avro4k 1.10.1: