Open adjivas opened 2 years ago
Hi @adjivas 👋🏼
Initially Avrora was designed to work with Record schemas, which allow you to have nested schemas of complex types. So far there has been no reason to register a primary type (I also can't find that definition in the Avro specs).
Probably that type can be registered as part of another Record schema.
Hi @Strech, sorry for the late reply. Here you can find the official list and a documented example of Primitive Types from the Avro specs:
{"type": "string"}
Primitive Types exist alongside Complex Types.
Sorry for the long pause. I will take a look, @adjivas, and drop a message about what we can do about it.
I've re-read the documentation (and also checked the erlavro source code). The specification (https://avro.apache.org/docs/1.11.1/specification/#primitive-types) says:
Primitive types have no specified attributes. Primitive type names are also defined type names. Thus, for example, the schema “string” is equivalent to:
{"type": "string"}
Then I checked erlavro and its parsing mechanism:
iex(1)> :avro_json_decoder.decode_schema(~s({"type":"string","name":"MyString"}), allow_bad_references: true)
{:avro_primitive_type, "string", []}
And the result is exactly as stated in the specification: primitive types can't have any attributes, so I don't see any in the output.
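To make the equivalence from the specification concrete, here is a minimal, language-neutral sketch in Python. It is not erlavro or Avrora code; the function name `normalize_primitive` is hypothetical. It treats "string" and {"type": "string"} as the same schema and drops every extra attribute. That matches the output above for "name", but a real parser may handle other attributes differently.

```python
import json

# The eight primitive type names defined by the Avro specification.
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def normalize_primitive(schema_json: str) -> str:
    """Return the bare primitive name for either spelling of a primitive schema.

    Hypothetical helper for illustration only.
    """
    schema = json.loads(schema_json)
    if isinstance(schema, str) and schema in PRIMITIVES:
        return schema
    if isinstance(schema, dict):
        kind = schema.get("type")
        if isinstance(kind, str) and kind in PRIMITIVES:
            # Extra attributes such as "name" are dropped by this sketch.
            return kind
    raise ValueError("not a primitive schema")


print(normalize_primitive('"string"'))
print(normalize_primitive('{"type": "string", "name": "MyString"}'))
```

Both calls yield the same bare name, which is why the "name" attribute cannot later be used to look the schema up.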
Here is another answer on that topic: https://stackoverflow.com/questions/66210730/aliases-for-primitive-types-in-avro
TL;DR: You can't reference a primitive type by an alias (our new name, per se).
If you have further questions, feel free to drop them here.
It is true that a primitive type cannot have a name, but unnamed types can be used in registries; the clearest example is a union (see https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html#multiple-event-types-in-the-same-topic).
If a registry has a schema that, instead of being a record, is a union (that is, in JSON, an array of other types, typically records), and Avrora is asked to decode a message marked with that id, then it will fail to transform the downloaded JSON into a valid schema to decode, instead returning {:error, :unnamed_type}.
@Strech do you agree that this should be reopened?
I might give it a try in that case.
@rewritten could you please provide an example of the schema you mentioned? Maybe it is indeed an issue. It could also be that the case is so rare or broken that it makes no sense to put effort into it.
Sorry I did not notice the response (not my main work account).
Another very common case is decoding message keys. Usually they are strings or numbers, but still Avro-encoded in the message payload. In that case the schema will be just "string".
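For illustration, this is roughly what decoding such a key involves. It is a stand-alone Python sketch, not Avrora code, and it assumes the key uses the Confluent wire format (a zero magic byte followed by a 4-byte big-endian schema id) and that the registered schema is "string". An Avro string is a zigzag varint length followed by UTF-8 bytes. All function names are hypothetical.

```python
import struct


def read_long(buf: bytes, pos: int = 0):
    """Decode one Avro long (zigzag varint) and return (value, next_position)."""
    shift = 0
    raw = 0
    while True:
        byte = buf[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1), pos  # undo zigzag encoding


def decode_string(body: bytes) -> str:
    """Decode an Avro-encoded string: length prefix, then UTF-8 bytes."""
    length, pos = read_long(body)
    return body[pos:pos + length].decode("utf-8")


def split_confluent_frame(message: bytes):
    """Split the assumed Confluent framing into (schema_id, avro_body)."""
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != 0:
        raise ValueError("not a Confluent-framed message")
    return schema_id, message[5:]


# A key "key" registered under a made-up schema id 42.
message = b"\x00" + (42).to_bytes(4, "big") + b"\x06" + b"key"
schema_id, body = split_confluent_frame(message)
print(schema_id, decode_string(body))
```

The payload itself is trivial to decode; the hard part in this thread is only that the schema fetched for that id has no name.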
In case of unions, the schema is an array:
[
{"type": "record", "name": "foo", "fields": [{"name": "foo_field", "type": "string"}]},
{"type": "record", "name": "bar", "fields": [{"name": "bar_field", "type": "string"}]}
]
Currently, only schemas that are JSON objects with a "name" field at their root are supported (i.e., records, enums, and, surprisingly, fixed types); unions and basic types fail instead.
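The distinction can be sketched as a check on the shape of the root JSON value. This is a language-neutral Python illustration of the rule described above, not how Avrora is implemented; `root_kind` and its labels are hypothetical.

```python
import json


def root_kind(schema_json: str) -> str:
    """Classify an Avro schema by the shape of its root JSON value."""
    schema = json.loads(schema_json)
    if isinstance(schema, str):
        return "bare name"       # e.g. "string"
    if isinstance(schema, list):
        return "union"           # e.g. ["int", "string"]
    if isinstance(schema, dict) and "name" in schema:
        return "named object"    # record, enum or fixed
    return "unnamed object"      # e.g. {"type": "array", "items": "string"}


record = '{"type": "record", "name": "foo", "fields": [{"name": "foo_field", "type": "string"}]}'
for candidate in (record, '["int", "string"]', '"string"'):
    print(root_kind(candidate))
```

Only the "named object" case has a root name to store the schema under; the other three shapes are the ones reported as failing here.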
Thanks for the additional details. I still need to wrap my head around it, but it feels like an issue now. It would be cool to have a basic failing example, like: this is the schema, here it is in the file, here is how I encode/decode, here is the error.
Then I can iterate over it until we have a working solution. I will come to this issue right after the issue about decoding logical types.
I have put together a stand-alone script that shows the issue. It starts by showing three types of schema and how they all work with simple decoders from :erlavro. Then it starts Avrora and populates the memory registry with the schema for a record, showing that it has the same behavior. Finally, it tries to do the same with a union and with a basic type, without succeeding.
https://gist.github.com/rewritten/2533573332d1de1e4b568def9c757c42
Let me know if I can help more.
I cannot provide an example that includes an actual Confluent registry (for obvious reasons); you will have to trust me on that.
Docs:
Examples:
["int", "string"]
{
"schema": "[\"int\",\"string\"]",
"schemaType": "AVRO",
"references": []
}
And with a custom type reference:
["int", "io.Payment"]
{
"schema": "[\"int\",\"io.Payment\"]",
"schemaType": "AVRO",
"references": [
{
"name": "io.Payment",
"subject": "io.Payment",
"version": 1
}
]
}
The reference name (anchor) above is 1:1 with the schema name (also the subject), but it could be different.
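The payloads above can be produced mechanically: the "schema" value is the Avro schema serialized to a JSON string, and each reference carries a name, a subject and a version. A minimal Python sketch follows; `registration_payload` is a hypothetical helper, not an Avrora or Schema Registry client function.

```python
import json


def registration_payload(schema, references=()):
    """Build a request body shaped like the examples above.

    `schema` is the parsed Avro schema (list, dict or str);
    `references` is an iterable of (name, subject, version) tuples.
    """
    return {
        # The schema is embedded as an escaped JSON string, not as nested JSON.
        "schema": json.dumps(schema, separators=(",", ":")),
        "schemaType": "AVRO",
        "references": [
            {"name": name, "subject": subject, "version": version}
            for name, subject, version in references
        ],
    }


payload = registration_payload(["int", "io.Payment"], [("io.Payment", "io.Payment", 1)])
print(json.dumps(payload, indent=2))
```

Passing a different first element in the tuple than the subject covers the case where the reference name and the subject are not 1:1.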
These examples require Avrora to:
references
new field in Schema Registry)
As a bonus, Avrora could fix that by the controlled registration before the schema
{
"type": "record",
"namespace": "io.confluent.examples.avro",
"name": "AllTypes",
"fields": [
{
"name": "oneof_type",
"type": [
"io.confluent.examples.avro.Customer",
"io.confluent.examples.avro.Product",
"io.confluent.examples.avro.Order"
]
}
]
}
This extra level of indirection allows automatic registration of the top-level Avro schema to work properly. However, unlike Protobuf, with Avro, the referenced schemas still need to be registered manually beforehand, as the Avro object does not have the necessary information to allow referenced schemas to be automatically registered.
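One way to find out which schemas have to be registered beforehand is to walk the top-level schema and collect the names it references but does not define. The following Python sketch does that under simplifying assumptions: types must be defined before use, aliases are ignored, and the helper names `fullname` and `unresolved` are hypothetical. It is not part of Avrora or of any Schema Registry client.

```python
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def fullname(name, namespace):
    """Qualify a name with the enclosing namespace unless it is already dotted."""
    return name if "." in name or not namespace else f"{namespace}.{name}"


def unresolved(schema, namespace=None, defined=None):
    """Return the set of full names referenced but not defined inside `schema`."""
    defined = set() if defined is None else defined
    missing = set()
    if isinstance(schema, str):
        if schema not in PRIMITIVES and fullname(schema, namespace) not in defined:
            missing.add(fullname(schema, namespace))
    elif isinstance(schema, list):  # union: check every branch
        for branch in schema:
            missing |= unresolved(branch, namespace, defined)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind in ("record", "enum", "fixed"):
            full = fullname(schema["name"], schema.get("namespace", namespace))
            defined.add(full)
            inner_namespace = full.rpartition(".")[0] or None
            for field in schema.get("fields", []):
                missing |= unresolved(field["type"], inner_namespace, defined)
        elif kind == "array":
            missing |= unresolved(schema["items"], namespace, defined)
        elif kind == "map":
            missing |= unresolved(schema["values"], namespace, defined)
        else:  # e.g. {"type": "string"} or a nested definition
            missing |= unresolved(kind, namespace, defined)
    return missing


all_types = {
    "type": "record",
    "namespace": "io.confluent.examples.avro",
    "name": "AllTypes",
    "fields": [{"name": "oneof_type", "type": [
        "io.confluent.examples.avro.Customer",
        "io.confluent.examples.avro.Product",
        "io.confluent.examples.avro.Order",
    ]}],
}
print(sorted(unresolved(all_types)))
```

For the AllTypes schema above this reports Customer, Product and Order, which are exactly the schemas that would need to exist in the registry first.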
Hello,
Is it possible to register a primary type's schema with Avrora?
A primary type schema is, for example, this:
When I try to save it with the register_schema_by_name function:
This error happens: