actgardner / gogen-avro

Generate Go code to serialize and deserialize Avro schemas
MIT License
365 stars 85 forks source link

Generated constructor code calls NewUnionNull*() which is overwritten by nil #178

Closed i7tsov closed 2 years ago

i7tsov commented 2 years ago

I've noticed some weirdness in the generated code:

func NewSample() Sample {
    r := Sample{}
    r.Number = 0
    r.NestedArray = NewUnionNullArrayNested()

    r.NestedArray = nil
    r.Mapping = NewUnionNullMapString()

    r.Mapping = nil
    return r
}

Avro schema contains unions of null&array and null&map. The result of NewUnionNullArrayNested() is instantly overwritten with nil (which is the default value for the field), allocating unnecessary memory on the heap. As far as I can see, there are no side effects of NewUnionNullArrayNested() function.

func NewUnionNullArrayNested() *UnionNullArrayNested {
    return &UnionNullArrayNested{}
}

Is that intended?

actgardner commented 2 years ago

Hey @i7tsov, can you post the Avro schema that produces this code?

i7tsov commented 2 years ago

Sure.

{
  "type": "record",
  "name": "Sample",
  "namespace": "foo",
  "fields": [
    {
      "name": "number",
      "type": "int",
      "default": 0
    },
    {
      "name": "nestedArray",
      "type": [
        "null",
        {
          "type": "array",
          "items": {
            "type": "record",
            "name": "nested",
            "fields": [
              {
                "name": "name",
                "type": "string",
                "default": ""
              },
              {
                "name": "count",
                "type": "int",
                "default": 0
              }
            ]
          }
        }
      ],
      "default": null
    },
    {
      "name": "mapping",
      "type": [
        "null",
        {
          "type": "map",
          "values": "string"
        }
      ],
      "default": null
    }
  ]
}

The root cause here from my understanding is that the field is a union of null and array (or map), and the default value is null. But generated code can be optimized to not create the instance of values other than null in such case.

actgardner commented 2 years ago

@i7tsov I've fixed this so that when the union field's default is null, the record constructor skips instantiating the struct.