Apicurio / apicurio-registry

An API/Schema registry - stores APIs and Schemas.
https://www.apicur.io/registry/
Apache License 2.0

Apicurio's implementation of KafkaSerializer/Deserializer can't deal with a wrapper type using unions #5437

Open lsegv opened 1 hour ago

lsegv commented 1 hour ago

The examples in the repository use generic types, which is pointless in Java. Most people will want to use specific types.

So let's define a message type that allows us to send multiple kinds of messages to the same topic without custom serialization hacks.

@namespace("com.lseg.shared.avro")
protocol prototype {
    // T* are our specific types in different files
    record T1 {
        int count;
    }

    record T2 {
        int count;
    }

    record T3 {
        int count;
    }

    record T4 {
        int count;
    }

    // this is the wrapper/envelope type, it lets us send around any message from anyone to anyone
    record Msg {
        // this is very efficient: the union branch index is a variable-length int, so a single byte is enough to encode which exact type is in the payload
        union{
            // developer will have to add the exact type here
            T1,
            T2,
            T3,

            // this will also remove all the redundant list/map wrapper types
            array<union{T1,T2,T3}>,
            map<union{T1,T2,T3}>
        } data;
    }
}

Try to send this type via Kafka and deserialize it successfully on the other end, without using generic records. We specifically want to parse an instance of Msg so we can figure out what the contents are and handle them accordingly.
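
As a side note on how the union in Msg is tagged on the wire: Avro's binary encoding writes the zero-based branch index as a zig-zag, variable-length int before the value itself. The sketch below re-implements only that int encoding with the JDK, so the class and method names (`UnionIndexSize`, `encodeInt`) are hypothetical and not part of Avro or Apicurio. It is meant to show how many bytes the tag costs for the five branches of `data`.

```java
import java.io.ByteArrayOutputStream;

public class UnionIndexSize {
    // Zig-zag plus variable-length encoding, as described for int values
    // in the Avro binary encoding. Hypothetical helper, not an Avro API.
    static byte[] encodeInt(int n) {
        int z = (n << 1) ^ (n >> 31);          // zig-zag: small magnitudes stay small
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((z & ~0x7F) != 0) {             // more than 7 bits left
            out.write((z & 0x7F) | 0x80);      // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out.write(z);                          // final byte, continuation bit clear
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Msg.data has five branches: T1, T2, T3, array<...>, map<...>
        for (int branch = 0; branch < 5; branch++) {
            System.out.println("branch " + branch + " -> "
                    + encodeInt(branch).length + " byte(s)");
        }
    }
}
```

Under this encoding a union needs 64 or more branches before the tag grows to a second byte.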

apicurio-bot[bot] commented 1 hour ago

Thank you for reporting an issue!

Pinging @carlesarnal to respond or triage.

lsegv commented 1 hour ago

I can provide more details, but I'm limited in what I can upload to the public internet. So ask for whatever you need to reproduce this (I've created a similar issue before with more context).

I've tested Avro's own serialization and it handles this just fine; as soon as I let the Apicurio Kafka library deal with it, it fails to parse it.

Most interesting is that at runtime, while debugging, I can see that it has managed to resolve all the types from the schema registry, but it fails to understand that T1 is a known type and fails to parse the Msg schema (even though all the data is actually correct). I believe it's a namespace resolution issue. The deserializer needs to handle namespaces correctly.
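
To make the namespace hypothesis concrete, here is a minimal sketch of the name resolution rule as I understand it from the Avro specification: a reference without a dot is looked up in the enclosing namespace, a dotted reference is already a full name, and primitive type names are never qualified. This is not Apicurio's code; `AvroNameResolution` and `fullName` are hypothetical names used only for illustration.

```java
import java.util.Set;

public class AvroNameResolution {
    // Primitive type names have no namespace.
    static final Set<String> PRIMITIVES = Set.of(
            "null", "boolean", "int", "long", "float", "double", "bytes", "string");

    // Resolve a type reference found inside a schema whose enclosing
    // namespace is `enclosingNamespace`. Hypothetical helper, not an Avro API.
    static String fullName(String name, String enclosingNamespace) {
        if (PRIMITIVES.contains(name)) {
            return name;                       // e.g. "int" stays "int"
        }
        if (name.contains(".")) {
            return name;                       // already fully qualified
        }
        if (enclosingNamespace == null || enclosingNamespace.isEmpty()) {
            return name;                       // null namespace
        }
        return enclosingNamespace + "." + name;
    }

    public static void main(String[] args) {
        String ns = "com.lseg.shared.avro";
        // A bare "T1" referenced from Msg should resolve into Msg's namespace.
        System.out.println(fullName("T1", ns));
        System.out.println(fullName("com.lseg.shared.avro.T1", ns));
        System.out.println(fullName("int", ns));
    }
}
```

If the deserializer looks up the bare name `T1` among schemas it registered under `com.lseg.shared.avro.T1` without applying a rule like this, the lookup would miss even though every referenced schema was fetched, which would match what I see in the debugger.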