confluentinc / confluent-kafka-dotnet

Confluent's Apache Kafka .NET client
https://github.com/confluentinc/confluent-kafka-dotnet/wiki
Apache License 2.0

Avro serializer? #67

Closed simplesteph closed 6 years ago

simplesteph commented 7 years ago

Is there an example or implementation of an Avro serializer for this library? That way we could use it alongside the Schema Registry, which would be great.

mhowlett commented 7 years ago

Hi @simplesteph - not yet, however this is pretty high on our priority list after initial release.

simplesteph commented 7 years ago

Awesome! When would that initial release be roughly? days? weeks? months? (just trying to estimate which library we should be using)

mhowlett commented 7 years ago

Scope is locked down and we are testing and documenting. Can't make any guarantees, but it's not far off.

amccague commented 7 years ago

@mhowlett Any indication of support for newer avro features such as Logical Types?

ewencp commented 7 years ago

@amccague We wouldn't reimplement Avro ourselves, so support would depend on what the library we use provides. I'd imagine the official C# library from Apache would support these types.

mhowlett commented 7 years ago

Microsoft seem to have an Avro library as well: https://hadoopsdk.codeplex.com/wikipage?title=Avro%20Library

We haven't assessed either at this point - opinions welcome.

Neither seems to support .NET Core, which is unfortunate.

treziac commented 7 years ago

I used the Microsoft Avro library on my side, with some changes. Porting it to .NET Core wasn't hard, and some people did it on GitHub, but the Microsoft Azure team didn't seem to give it more thought (https://github.com/Azure/azure-sdk-for-net/issues/2322). They also didn't port Avro to their new main branch and don't really reply about it (https://github.com/Azure/azure-sdk-for-net/issues/2663).

I made some assumptions at the beginning (as I was just discovering Avro), generating the Avro schema from strongly typed objects rather than the opposite, and had to make small changes (for DateTime support, default values, and others). But overall it's working OK.

There is also the Apache Avro C# library (https://github.com/apache/avro/tree/master/lang/csharp), which doesn't seem to have been updated in 2 years and lacks proper quickstart documentation. I never posted an issue on their side, but that's part of why I chose the Microsoft library. Someone posted an implementation of the Schema Registry REST API in .NET using it: https://github.com/jakobz/schema-registry-dotnet. I didn't test it, but I assume it would be easy to bind it for serialization here. I did much the same with the Microsoft library, but it's not open sourced.

Both libraries use Avro spec 1.7.7, if I remember correctly (same as the Kafka Schema Registry). An important thing to do (I didn't) would be to map the logical types the way the Java Kafka client does, such as the date type and others, for Kafka Connect compatibility.

mhowlett commented 7 years ago

thanks for the input @treziac

here's another relevant repo: https://github.com/Judopay/Judo.Kafka - also uses the Microsoft library.

jakobz commented 7 years ago

I'm the maintainer of the https://github.com/jakobz/schema-registry-dotnet repo. The library works fine for us in the AVRO + "Schema registry" setup.

There's the https://github.com/jakobz/schema-registry-dotnet/blob/master/SchemaRegistry/RegistryAwareSerializer.cs class, which handles the Confluent protocol in messages, and AVRO schema publishing/retrieval.
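
For readers unfamiliar with the "confluent protocol" mentioned above: the Schema Registry serializers prefix each message with a magic byte (0) and the 4-byte big-endian schema ID, followed by the Avro binary payload. Below is a minimal sketch of that framing. It is written in Python only for brevity, since the byte layout is the same in any language. The names `frame` and `unframe` are hypothetical and not taken from any library in this thread.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # version marker at the start of every framed message


def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an already Avro-encoded payload with the 5-byte header."""
    # ">bI" = big-endian, 1-byte magic followed by a 4-byte unsigned schema ID
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, avro_payload)."""
    if len(message) < 5:
        raise ValueError("message too short to contain the 5-byte header")
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte: %d" % magic)
    return schema_id, message[5:]
```

A consumer would call `unframe` first, look up the writer schema for the returned ID in the registry, and only then hand the remaining bytes to an Avro decoder.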

robertruetz commented 7 years ago

I'm already using avro to serialize my messages and would like to produce them without serialization. However, the comments around the Producer .ctor indicate that method may be made private in favor of the Serializing Producers. What am I supposed to do if the payload I want to pass in is already an avro serialized byte array?

treziac commented 7 years ago

Simply use a <byte[]> serializer

purkhusid commented 7 years ago

Any news on this?

treziac commented 7 years ago

From what I know, the future goal is to add a C# binding for https://github.com/confluentinc/libserdes. I don't know the priority for this on Confluent's side.

Currently, you can easily use or adapt one of the community projects that implement a schema registry client, like https://github.com/Judopay/Judo.Kafka or https://github.com/jakobz/schema-registry-dotnet, to create an ISerializer usable in Confluent.Kafka (Judo.Kafka actually does that, but on an older version). The serializer interface on Confluent.Kafka master now includes the topic, so it will be possible to integrate Avro easily (without having to create a separate serializer for each topic).

If you want to use 0.9.5 or 0.11.0 (RC), you will have to create a base Producer and call GetSerializingProducer on it for each topic where you need to use avro (so that you can have a serializer for a given topic)

If you have trouble, I can provide some samples

thara56 commented 7 years ago

Appreciate if you can give some samples @treziac ...

treziac commented 7 years ago

Ok, will do this today ;)

treziac commented 7 years ago

@thara56 and others: I didn't have time yesterday, but had some this afternoon :) Using Judo.Kafka was not immediate, as some members are not public.

I forked the project and made some modifications (added ISerializer and IDeserializer and made the API public). You can fork it and try the sample here (adjust to your own servers): https://github.com/treziac/Judo.Kafka/tree/master/test/Sample

The main problem is still that there is no good Avro library for C# (timestamps, logical types, etc. are not supported or not well supported) and none that handles the Confluent specifics. But you can still use it for simple objects ;)

I may rework the one I did at work (with a personal fork of the original Microsoft.Hadoop.Avro, which has no commit history), but as I can't post the code I wrote at work at that time, it may never happen (and it was far from perfect and used as a proof of concept - we don't use Avro for now in our env, only JSON).

anupkumarsharma commented 7 years ago

A little late, but do we have any good options? I was trying to convert an object to Avro. Java has ReflectDatumWriter, but there is no such option in C#.

treziac commented 7 years ago

I didn't take the time to finish it properly, but you can check https://github.com/confluentinc/confluent-kafka-dotnet/pull/319. It is compatible with netframework and netcore2.0 (but not netcoreapp1.x).

I've been meaning to finish it for months now ^^'

mhowlett commented 7 years ago

We're not involved in that project. Very soon I'm going to start working on #351, which adds Avro/Schema Registry support to this library.

kavyashivakumar commented 6 years ago

@treziac in your sample https://github.com/treziac/Judo.Kafka/tree/master/test/Sample how are logical types taken care of? You mentioned it can be done for simple objects. I'm looking for "date". I am trying to use the MS Avro library.
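
As background for the "date" question: in the Avro specification, the `date` logical type annotates an `int` holding the number of days since 1 January 1970, and an Avro `int` is written as a zig-zag, variable-length integer. The sketch below shows only what the specification expects on the wire, not how the Microsoft library or the sample above handles it. It is in Python for brevity, and the helper names are hypothetical.

```python
from datetime import date

EPOCH = date(1970, 1, 1)


def date_to_avro_days(d: date) -> int:
    """Days since the Unix epoch, the value stored by the `date` logical type."""
    return (d - EPOCH).days


def encode_avro_int(n: int) -> bytes:
    """Encode an integer as Avro does: zig-zag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small negatives map to small positives
    out = bytearray()
    while True:
        low7 = z & 0x7F
        z >>= 7
        if z:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_avro_date(d: date) -> bytes:
    """Bytes an Avro writer would emit for a field of logical type `date`."""
    return encode_avro_int(date_to_avro_days(d))
```

A library without logical type support can still interoperate if the schema declares the field as `int` with `"logicalType": "date"` and the application does the day-count conversion itself.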

JSkimming commented 6 years ago

Has this been implemented? It looks like it's in the release notes of 0.11.4.

rnpridgeon commented 6 years ago

I believe so, here is the PR:

https://github.com/confluentinc/confluent-kafka-dotnet/pull/357

mhowlett commented 6 years ago

for more information, have a look at the examples: https://github.com/confluentinc/confluent-kafka-dotnet/tree/master/examples

JSkimming commented 6 years ago

@rnpridgeon, @mhowlett Thanks.

I highlighted this because this issue was the top result when I searched for Kafka Avro Serialization .NET, and because the issue is open, my first impression was that it is not supported.

Only after finding the release notes did I think differently.

anupkumarsharma commented 6 years ago

Not sure if it completely helps: https://github.com/anupkumarsharma/KafkaAvroNet/blob/master/Core/KafkaAvroNet.Avro/Providers/ReflectionSerializationProvider.cs

It's a work in progress; support for more data types needs to be added.