AdrianStrugala / AvroConvert

Rapid Avro serializer for C# .NET

Schema generation doesn't handle generics #159

Open Robospecta opened 2 months ago

Robospecta commented 2 months ago

What is the bug?

Generating schemas with AvroConvert.GenerateSchema does not appear to support generic types. When a generic class is used with different type arguments, every resulting Avro record schema gets the same name.

For me, the result is an exception when I try to save this schema to the Azure Event Hubs Schema Registry (some information has been redacted):

{"error":{"code":"InvalidRequest","message":"Avro schema validation failed: Duplicate schema name Field1."}}

This happens because, when a generic class is used with different type arguments on properties of the same schema, each property produces a record schema with the same type name. The simple example below shows it: the record schema name is identical for both properties. The generator seems to append the number of generic arguments to the class name to form the record name, but it does not take into account the concrete type passed to the generic.
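One plausible explanation for the "Field1" name, offered as an assumption and not verified against the AvroConvert source, is that the record name comes from .NET's Type.Name with the backtick removed. For any closed generic type, Type.Name returns the generic definition's name plus the arity and carries no information about the type arguments. A minimal sketch of that behaviour (the Field<T> class mirrors the one in the reproduction; NameDemo is only an illustrative name):

```csharp
using System;

public class Field<T>
{
    public T Value { get; set; }
}

public static class NameDemo
{
    public static void Main()
    {
        // Both closed types report the same simple name: Field`1
        Console.WriteLine(typeof(Field<string>).Name);
        Console.WriteLine(typeof(Field<int>).Name);

        // The type arguments are only available separately.
        Console.WriteLine(typeof(Field<string>).GetGenericArguments()[0].Name); // String
        Console.WriteLine(typeof(Field<int>).GetGenericArguments()[0].Name);    // Int32
    }
}
```

If the name is built this way, stripping the backtick from either value yields "Field1", which matches the duplicate name in the generated schema.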

How to reproduce?

https://dotnetfiddle.net/ZZh8ux

using System;
using SolTechnology.Avro;

public class Program
{
    public class Field<T>
    {
        public T Value { get; set; }

        public float Confidence { get; set; }
    }

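    public class Schema
    {
        // Same generic class, closed over string...
        public Field<string> Subject { get; set; }

        // ...and closed over int. Both end up as a record named "Field1".
        public Field<int> Mark { get; set; }
    }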

    public static void Main()
    {
        var schema = AvroConvert.GenerateSchema(typeof(Schema));
        Console.WriteLine(schema);
    }
}

What is the Avro schema?

{
  "name": "Schema",
  "type": "record",
  "fields": [
    {
      "name": "Subject",
      "type": {
        "name": "Field1",
        "type": "record",
        "fields": [
          { "name": "Value", "type": "string" },
          { "name": "Confidence", "type": "float" }
        ]
      }
    },
    {
      "name": "Mark",
      "type": {
        "name": "Field1",
        "type": "record",
        "fields": [
          { "name": "Value", "type": "int" },
          { "name": "Confidence", "type": "float" }
        ]
      }
    }
  ]
}

What is the Avro data?

N/A. This issue only involves generating the schema; see above.

What is the expected behavior?

When generating a schema from a class whose properties use generic types, the record schema name should include not only the class name and the number of generic arguments but also the name of each type passed to the generic.

{
  "name": "Schema",
  "type": "record",
  "fields": [
    {
      "name": "Subject",
      "type": {
        "name": "Field1String",
        "type": "record",
        "fields": [
          { "name": "Value", "type": "string" },
          { "name": "Confidence", "type": "float" }
        ]
      }
    },
    {
      "name": "Mark",
      "type": {
        "name": "Field1Int",
        "type": "record",
        "fields": [
          { "name": "Value", "type": "int" },
          { "name": "Confidence", "type": "float" }
        ]
      }
    }
  ]
}
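The naming rule requested above could be sketched roughly as follows. This is a hypothetical helper, not part of AvroConvert's API: BuildRecordName and AvroNameSketch are names invented for illustration. It appends the CLR names of the type arguments, so it produces "Field1Int32" where the expected schema shows "Field1Int"; getting Avro-style names such as "Int" would need an extra alias table. Arrays and other types whose names contain characters that are invalid in Avro names are not handled.

```csharp
using System;
using System.Linq;

public class Field<T>
{
    public T Value { get; set; }
    public float Confidence { get; set; }
}

public static class AvroNameSketch
{
    // Hypothetical helper: builds a record name that stays unique per closed generic type.
    public static string BuildRecordName(Type type)
    {
        if (!type.IsGenericType)
        {
            return type.Name;
        }

        // "Field`1" -> "Field1"
        string baseName = type.Name.Replace("`", "");

        // Recurse so nested generics such as Field<Field<int>> also get distinct names.
        var argumentNames = type.GetGenericArguments().Select(BuildRecordName);

        return baseName + string.Concat(argumentNames);
    }
}

public static class NamingDemo
{
    public static void Main()
    {
        Console.WriteLine(AvroNameSketch.BuildRecordName(typeof(Field<string>))); // Field1String
        Console.WriteLine(AvroNameSketch.BuildRecordName(typeof(Field<int>)));    // Field1Int32
    }
}
```

With a rule like this, the two properties in the reproduction would map to different record names, so the registry's duplicate-name check would no longer be triggered by them.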