ghost opened this issue 5 years ago (status: Open)
I haven't had a chance to look closely, but suspect this is an issue with the avro library. thanks for reporting.
I created a console application that produces and consumes using the schema provided in the issue. I used the same client library version and dotnet version, and ran the app on Windows. Everything works fine, so I don't think it is a problem with the Avro library. You can run the sample app easily:
https://github.com/Mousavi310/confluent-kafka-dotnet-issues/tree/master/Issue1034
You just need to run the following command (in the Issue1034 directory):
`dotnet run`
The sample code uses `AutoOffsetReset.Earliest` and is located in the `KafkaHostedService` file. I didn't use the `Confluent.Kafka.Avro` NuGet package; as the changelog notes, `Confluent.Kafka.Avro` has been renamed to `Confluent.SchemaRegistry.Serdes`.
It is completely unclear to me how to replace `Confluent.Kafka.Avro` with `Confluent.SchemaRegistry.Serdes`. There is a type mismatch when trying to assign the value deserializer:
Argument 1: cannot convert from 'Confluent.SchemaRegistry.Serdes.AvroDeserializer<*>' to 'Confluent.Kafka.IDeserializer<*>'
`Confluent.SchemaRegistry.Serdes` is for use with API v1.0 and above only. `Confluent.Kafka.Avro` is for use with the 0.11.x API. We highly recommend making the switch to the new API, as 0.11.x is now a long way behind in terms of bug fixes / robustness / features.
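For anyone migrating, the package swap in the .csproj looks roughly like this (a sketch; the version numbers are the ones mentioned later in this thread, and from 1.4 onward the Avro serdes live in a dedicated package):

```xml
<!-- old, 0.11.x API: -->
<PackageReference Include="Confluent.Kafka.Avro" Version="0.11.6" />

<!-- new, 1.x API (Avro serdes split into their own package as of 1.4): -->
<PackageReference Include="Confluent.Kafka" Version="1.4.3" />
<PackageReference Include="Confluent.SchemaRegistry.Serdes.Avro" Version="1.4.3" />
```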
Thanks for the information!
What I do not understand is the versioning between the two available sets. Does `ConsumerBuilder` come from 0.11.x? Is there a replacement?
I am experimenting now with .NET Core and Kafka. What I am trying to do now is something like:
.SetValueDeserializer(new Confluent.SchemaRegistry.Serdes.AvroDeserializer<OrderEvent<Order>>(schemaConfig))
The method `SetValueDeserializer` expects a `Confluent.Kafka.IDeserializer`. Maybe I am doing something wrong, but I am unable to find up-to-date documentation and examples (especially about the transition from 0.11.x to 1.0.0).
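For reference, this compile error happens because `AvroDeserializer<T>` implements the asynchronous `IAsyncDeserializer<T>`, while `SetValueDeserializer` here expects the synchronous `IDeserializer<T>`. The client ships an adapter for exactly this case: the `AsSyncOverAsync()` extension in the `Confluent.Kafka.SyncOverAsync` namespace. A minimal sketch (the `OrderEvent<Order>` type comes from the snippet above; broker and registry addresses are placeholders):

```csharp
using Confluent.Kafka;
using Confluent.Kafka.SyncOverAsync;   // provides the AsSyncOverAsync() extension
using Confluent.SchemaRegistry;
using Confluent.SchemaRegistry.Serdes;

var schemaRegistry = new CachedSchemaRegistryClient(
    new SchemaRegistryConfig { Url = "http://localhost:8081" });

var consumerConfig = new ConsumerConfig
{
    BootstrapServers = "localhost:9092",
    GroupId = "example-group"
};

var consumer = new ConsumerBuilder<string, OrderEvent<Order>>(consumerConfig)
    // Wrap the async Avro deserializer so it satisfies IDeserializer<T>:
    .SetValueDeserializer(
        new AvroDeserializer<OrderEvent<Order>>(schemaRegistry).AsSyncOverAsync())
    .Build();
```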
Hello @mhowlett, we are also facing this problem with all the latest library versions. Keywords like `event` in the namespace are causing deserialization problems.
Hi @mhowlett, is there any update on this issue? I am facing a similar issue when there is a C# keyword (event) in the namespace. I am using the latest versions of the libraries.
Hi @codebased & @vkhose - I'm interested that you both report the same namespace issue, that is: using a keyword like "event" in the schema namespace. This is the first I have heard of other parties experiencing the same issue.

As you can see from the chain above, @Mousavi310 confirmed in a response to me that Confluent was UNABLE to reproduce the issue I originally submitted. I therefore assumed it must have been something on my end, and I never had time to investigate further (e.g. I wondered whether our Kafka installation version was worth looking into, since that was the only other variable - we agreed on our .NET Core and Confluent client versions when he tried to reproduce the issue). Due to other pressures, I made no further investigations, and my solution was to give up on using the keyword "event" in our Kafka schema's namespace.

Sorry this is not more helpful - all I would suggest is that you both provide details similar to those I provided in this issue: the Confluent folks will at least have additional data/samples to review in trying to identify this issue.
Thanks @paulctl. I am using the following libraries for my consumer:
- Confluent.Kafka v1.4.3
- Confluent.SchemaRegistry v1.4.3
- Confluent.SchemaRegistry.Serdes.Avro v1.4.3

I am using the avrogen tool (v1.9.0.0) to generate C# files from the Avro schema. The generated files have a namespace like au.com.example.company.@event - since "event" is a C# keyword, avrogen wraps it with @.
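Roughly what the generated code looks like (a sketch; the class body is elided, the namespace is the point):

```csharp
// avrogen escapes the C# keyword "event" with a verbatim identifier, so the
// schema namespace "au.com.example.company.event" becomes the following:
namespace au.com.example.company.@event
{
    public partial class Unit
    {
        // generated ISpecificRecord members (Schema, Get, Put, fields) omitted
    }
}
```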
The deserializer is configured as - .SetValueDeserializer(new AvroDeserializer
When I try to consume events with this setup, it gives me exactly the same error as mentioned above.
P.S. This happens only when I have an array of complex objects in my message.
This issue is very easily reproducible by modifying one of the deserialization unit test cases to provide a complex object with an array of another complex type as a child element, with both the parent and child classes in the @event namespace.
Sample code for test case and avro class https://github.com/vkhose/samplecode/tree/master
It looks like the same utility methods are used both to generate code for a specific record and to read the record during deserialization: https://github.com/apache/avro/blob/c0094b5bb1abb79304ce42a56cc115186370d407/lang/csharp/src/apache/main/CodeGen/CodeGenUtil.cs#L99
if (ReservedKeywords.Contains(names[i])) builder.Append(At);
Appending At (@) while generating code makes sense, but it should not be appended while deserializing. Deserialization fails because the C# runtime does not expect At (@) in the string value used to create the class instance: https://github.com/apache/avro/blob/master/lang/csharp/src/apache/main/Specific/ObjectCreator.cs#L165
The string parameter passed to Type.GetType(string) should not contain the "@" character in the namespace name.
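A quick way to see why this fails: `@` is only a C# source-level escape, not part of the runtime type name, so `Type.GetType` returns null when the escaped form is passed (a minimal sketch; the namespace and type name are taken from the reports above):

```csharp
using System;

namespace au.com.example.company.@event
{
    public class SubObject { }
}

class Program
{
    static void Main()
    {
        // The CLR metadata name has no "@" - the escape exists only in C# source.
        Console.WriteLine(Type.GetType("au.com.example.company.event.SubObject"));
        // prints: au.com.example.company.event.SubObject

        Console.WriteLine(Type.GetType("au.com.example.company.@event.SubObject") == null);
        // prints: True - this is effectively the lookup Avro's ObjectCreator performs
    }
}
```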
Your example (the Issue1034 app above) works fine. But if you update the NuGet packages to the latest v1.5.0, you start getting the "Local: Key deserialization error" on the consume operation. I have tested this in your example code.
Same issue for 1.7.0. Is there any plan to fix this defect?
This is a bug in the Apache.Avro project. I would recommend using Protobuf or JSON, which both have reliable implementations in .NET. Or you could try this alternative Avro implementation: https://github.com/ch-robinson/dotnet-avro. We do not link to this library because it doesn't have a community, but it was built out of frustration with the quality of the Apache.Avro implementation, so there's a good chance it is better.
Thanks @mhowlett . I created a jira ticket on Apache.Avro long time back but doesn't look like it is being fixed. I had a workaround for my implementation last time. But I will try your suggestions too.
Hi @vkhose, can you please confirm the following is the issue raised in Apache.Avro - https://issues.apache.org/jira/browse/AVRO-2888 Just to be able to keep an eye on it. Thanks
Yes. This is the one.
Can you describe your workaround implementation, please?
@razvanreb - the workaround was not very good and involved manually updating the autogenerated classes. I would not recommend going with any workaround, as this issue is now fixed in the apache.avro library (v1.11.0). More details here: https://issues.apache.org/jira/browse/AVRO-2888 and https://issues.apache.org/jira/browse/AVRO-3075. The apache.avro version update is pending release for this repo (already updated in master); it will be available with Confluent.SchemaRegistry.Serdes.Avro soon.
The problem is caused by this code in Confluent.SchemaRegistry.Serdes.Avro/SpecificDeserializerImpl.cs, line 146:
return datumReader.Read(default(T), new BinaryDecoder(stream));
Even though the Deserializer class knows exactly what type it needs to deserialize to, it always passes default(T) - which is always null for .NET reference types - to the datumReader. This causes the DatumReader to create a new instance using the namespace and class name from the schema.
The solution is to use Activator.CreateInstance to supply a concrete instance instead of default(T).
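A minimal sketch of the change being described (illustrative; the actual PR may differ, though avrogen-generated specific records do have public parameterless constructors):

```csharp
// Before: default(T) is null for reference types, so the DatumReader falls
// back to resolving the target type by the schema's (escaped) full name.
return datumReader.Read(default(T), new BinaryDecoder(stream));

// After: hand the reader a concrete instance so no lookup by name is needed.
return datumReader.Read(Activator.CreateInstance<T>(), new BinaryDecoder(stream));
```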
BTW, the Azure SDK, which also uses the Avro library, has this implemented correctly.
@mhowlett Let me know if you want me to create a PR for this code change.
@sergemat - yep, sounds good. can you include a test case? we can target 1.9.1, which will probably be coming out relatively soon.
@mhowlett OK. Please review PR #1847
How is this still an issue years later? I'm getting the same compiler error today:
Argument 1: cannot convert from 'Confluent.SchemaRegistry.Serdes.AvroDeserializer<*>' to 'Confluent.Kafka.IDeserializer<*>'
Description
When creating a schema (which is registered in the Schema Registry) with an array of sub-schemas, and using a C# keyword in the schema namespace, we are unable to deserialize the Kafka message using AvroDeserializer(...).
The producer successfully uses AvroSerializer(...) to publish the message.
When consuming, using Confluent.SchemaRegistry.Serdes.AvroDeserializer we receive the exception:
Confluent.Kafka.ConsumeException: Local: Value deserialization error ---> Avro.AvroException: Unable to find type com.company.example.@event.SubObject in all loaded assemblies in field SubObjects
How to reproduce
Create a schema, ensuring the namespace contains the word "event" (a C# keyword). Schema example:

```json
{
  "type": "record",
  "name": "NewConstructionAddressEvent",
  "namespace": "com.company.sub.event",
  "doc": "@author: Smith, @description: Avro Schema for an address",
  "fields": [
    {
      "name": "eventId",
      "type": { "type": "string", "avro.java.string": "String" },
      "doc": "@required: true, @description: unique id (UUID version 4 and variant 2) for an event, @examples: d15f36fe-ab1e-4d5c-9a04-a1827ac0c330"
    },
    {
      "name": "eventType",
      "type": { "type": "string", "avro.java.string": "String" },
      "doc": "@required: true, @description: operation type for event, @examples: created|updated|deleted"
    },
    {
      "name": "constructionAddressId",
      "type": { "type": "string", "avro.java.string": "String" },
      "doc": "@required: true, @description: unique nds id for a construction address object, @examples: 35051923"
    },
    {
      "name": "units",
      "type": [
        "null",
        {
          "type": "array",
          "items": {
            "type": "record",
            "name": "Unit",
            "fields": [
              {
                "name": "unitNumber",
                "type": [ "null", { "type": "string", "avro.java.string": "String" } ],
                "doc": "@required: false, @description: a specific unit number for an individual unit within a multi-dwelling unit, @examples: 1|101",
                "default": null
              },
              {
                "name": "type",
                "type": [ "null", { "type": "string", "avro.java.string": "String" } ],
                "doc": "@required: false, @description: the type of the unit, @examples: Apartment|Building",
                "default": null
              },
              {
                "name": "story",
                "type": [ "null", { "type": "string", "avro.java.string": "String" } ],
                "doc": "@required: false, @description: the story or floor number for the unit, @examples: 1|2|3",
                "default": null
              },
              {
                "name": "fiberCount",
                "type": [ "null", { "type": "string", "avro.java.string": "String" } ],
                "doc": "@required: false, @description: the number of fibers available at the unit, @examples: 1|4",
                "default": null
              }
            ]
          }
        }
      ],
      "doc": "@required: false, @description: unit numbers will be available for multi-dwelling unit - demand points, @examples: unit number details",
      "default": null
    },
    {
      "name": "constructionIndicator",
      "type": { "type": "string", "avro.java.string": "String" },
      "doc": "@required: true, @description: construction stages (yes means in construction stage and no means in completed stage), @examples: yes|no"
    }
  ]
}
```
Generate the associated C# code files using avrogen.
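For reference, the avrogen invocation looks something like this (schema filename and output directory are illustrative):

```
avrogen -s NewConstructionAddressEvent.avsc .
```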
Now, execute the consumer code (after publishing):
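A hedged reconstruction of a consumer consistent with the stack trace below (broker/registry addresses, topic, and group id are illustrative; the value type is the avrogen-generated class from the schema above):

```csharp
using Confluent.Kafka;
using Confluent.Kafka.SyncOverAsync;
using Confluent.SchemaRegistry;
using Confluent.SchemaRegistry.Serdes;
using com.company.sub.@event;  // namespace generated by avrogen from the schema above

class ConsumerDemo
{
    static void Main()
    {
        var schemaRegistry = new CachedSchemaRegistryClient(
            new SchemaRegistryConfig { Url = "http://localhost:8081" });

        using var consumer =
            new ConsumerBuilder<string, NewConstructionAddressEvent>(new ConsumerConfig
                {
                    BootstrapServers = "localhost:9092",
                    GroupId = "construction-address-consumer",
                    AutoOffsetReset = AutoOffsetReset.Earliest
                })
                .SetValueDeserializer(
                    new AvroDeserializer<NewConstructionAddressEvent>(schemaRegistry)
                        .AsSyncOverAsync())
                .Build();

        consumer.Subscribe("new-construction-address");

        // Throws the ConsumeException below when the schema namespace contains
        // the keyword "event" and the message has a populated array field.
        var result = consumer.Consume();
    }
}
```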
The consumer code can read the Kafka message as a GenericRecord successfully, but using a SpecificRecord (as specified in the code snippet above) will result in the exception:

```
Confluent.Kafka.ConsumeException: Local: Value deserialization error ---> Avro.AvroException: Unable to find type com.company.example.@event.units in all loaded assemblies in field emails
   at Avro.Specific.SpecificDefaultReader.ReadRecord(Object reuse, RecordSchema writerSchema, Schema readerSchema, Decoder dec)
   at Avro.Generic.DefaultReader.Read[T](T reuse, Decoder decoder)
   at Confluent.SchemaRegistry.Serdes.SpecificDeserializerImpl`1.Deserialize(String topic, Byte[] array)
   at Confluent.SchemaRegistry.Serdes.AvroDeserializer`1.DeserializeAsync(ReadOnlyMemory`1 data, Boolean isNull, SerializationContext context)
   at Confluent.Kafka.SyncOverAsync.SyncOverAsyncDeserializer`1.Deserialize(ReadOnlySpan`1 data, Boolean isNull, SerializationContext context)
   at Confluent.Kafka.Consumer`2.ConsumeImpl[K,V](Int32 millisecondsTimeout, IDeserializer`1 keyDeserializer, IDeserializer`1 valueDeserializer)
   --- End of inner exception stack trace ---
   at Confluent.Kafka.Consumer`2.ConsumeImpl[K,V](Int32 millisecondsTimeout, IDeserializer`1 keyDeserializer, IDeserializer`1 valueDeserializer)
   at Confluent.Kafka.Consumer`2.Consume(CancellationToken cancellationToken)
```

I believe that for .NET C#, if a schema namespace contains a C# keyword (e.g. event), then the AvroDeserializer fails.
NuGet versions:
- Confluent.Kafka = 1.1.0
- Confluent.Kafka.Avro = 0.11.6
- Confluent.SchemaRegistry = 1.1.0
- Confluent.SchemaRegistry.Serdes = 1.1.0

Operating system / client configuration: Windows, .NET Core 2.2, C#

I think the key to reproducing this issue is to create an Avro schema that includes the word "event" in the namespace, where the schema also includes a "sub-schema" of an array of objects - please see the "Schema example" provided above. Then create a producer (using AvroSerializer) and a consumer (using AvroDeserializer).