lerouxrgd / rsgen-avro

Command line and library for generating Rust types from Avro schemas
MIT License

Support recursive types #63

Closed rockwotj closed 2 months ago

rockwotj commented 6 months ago

The following schema causes a segfault for me. It is taken from the official Apache Avro test suite: https://github.com/apache/avro/blob/main/share/test/schemas/interop.avsc

{"type": "record", "name":"Interop", "namespace": "org.apache.avro",
  "fields": [
      {"name": "intField", "type": "int"},
      {"name": "longField", "type": "long"},
      {"name": "stringField", "type": "string"},
      {"name": "boolField", "type": "boolean"},
      {"name": "floatField", "type": "float"},
      {"name": "doubleField", "type": "double"},
      {"name": "bytesField", "type": "bytes"},
      {"name": "nullField", "type": "null"},
      {"name": "arrayField", "type": {"type": "array", "items": "double"}},
      {"name": "mapField", "type":
       {"type": "map", "values":
        {"type": "record", "name": "Foo",
         "fields": [{"name": "label", "type": "string"}]}}},
      {"name": "unionField", "type":
       ["boolean", "double", {"type": "array", "items": "bytes"}]},
      {"name": "enumField", "type":
       {"type": "enum", "name": "Kind", "symbols": ["A","B","C"]}},
      {"name": "fixedField", "type":
       {"type": "fixed", "name": "MD5", "size": 16}},
      {"name": "recordField", "type":
       {"type": "record", "name": "Node",
        "fields": [
            {"name": "label", "type": "string"},
            {"name": "children", "type": {"type": "array", "items": "Node"}}]}}
  ]
}

System information

$ uname -a
Linux fastpanda 6.5.0-26-generic #26-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar  5 21:19:28 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

rockwotj commented 6 months ago

First time I ran the tool:

$ ./rsgen/rsgen-avro ../redact/avro/schema.avsc src/avro.rs
Templating error: Ref `Name { name: "Node", namespace: Some("org.apache.avro") }` is not resolved. Schema: Ref { name: Name { name: "Node", namespace: Some("org.apache.avro") } }

If I remove the namespace for the top level record I get:

$ ./rsgen/rsgen-avro ../redact/avro/schema.avsc src/avro.rs
zsh: segmentation fault (core dumped)  ./rsgen/rsgen-avro ../redact/avro/schema.avsc src/avro.rs

martin-g commented 6 months ago

Here it fails with:

./target/release/rsgen-avro interop.avsc -

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
fish: Job 1, './target/release/rsgen-avro int…' terminated by signal SIGABRT (Abort)

martin-g commented 6 months ago

There are two issues:

  1. Usage of Schema::Null: rsgen-avro does not support it.
  2. The last field, recordField, uses recursion (i.e. Schema::Ref). Removing this field resolves the stack overflow.

rockwotj commented 6 months ago

Thanks for the quick response. I think the null one is fine; the stack overflow is more important. I'll update this issue to ask for support for recursive types.

lerouxrgd commented 6 months ago

Recursive schemas are indeed not supported; tbh I wasn't even aware of them. @martin-g do you know what a valid Rust struct corresponding to this schema (w/o the null field) would look like? I guess Box<Node> here, but is that actually supported by apache-avro?
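
For reference, a minimal hand-written sketch of how the recursive Node record could be expressed in plain Rust. This is an assumption for illustration, not rsgen-avro's actual output, and it does not check whether apache-avro can (de)serialize these shapes. LinkedNode and count_nodes are hypothetical names that do not appear in the schema.

```rust
/// Hypothetical hand-written equivalent of the `Node` record from the schema.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct Node {
    pub label: String,
    // `Vec<Node>` keeps its elements on the heap, so `Node` has a finite size
    // and no explicit `Box` is needed for an Avro array of `Node`.
    pub children: Vec<Node>,
}

/// Hypothetical variant with a direct self-reference, e.g. what a nullable
/// `["null", "LinkedNode"]` field might map to. Here `Box` is required,
/// otherwise the struct would have infinite size and fail to compile.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct LinkedNode {
    pub label: String,
    pub next: Option<Box<LinkedNode>>,
}

/// Counts a node plus all of its descendants.
pub fn count_nodes(node: &Node) -> usize {
    1 + node.children.iter().map(count_nodes).sum::<usize>()
}
```

So for this particular schema the array already provides the indirection, and Box would only be needed where a record refers to itself directly.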

lerouxrgd commented 6 months ago

Actually it fails when determining whether the generated struct can be Eq or not. This boils down to looking for a "float" recursively, which stack overflows here because the recursion never terminates.

If you change {"name": "label", "type": "string"} to {"name": "label", "type": "float"} it manages to generate the code. So it shouldn't be too hard to fix.
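
A rough sketch of one way such a check can be made safe on recursive schemas: remember which records are already being examined and stop when a reference leads back to one of them. The Schema enum below is a toy model, and contains_float, node_schema, and can_derive_eq are hypothetical names; none of this is rsgen-avro's or apache-avro's actual code.

```rust
use std::collections::{HashMap, HashSet};

/// Toy schema model (hypothetical, far smaller than apache-avro's `Schema`).
#[derive(Debug, Clone)]
pub enum Schema {
    String,
    Float,
    Array(Box<Schema>),
    Record { name: String, fields: Vec<Schema> },
    Ref(String),
}

/// Returns true if a float is reachable from `schema`.
/// `visiting` holds the names of records already entered, so a `Ref` that
/// leads back to one of them returns instead of recursing forever.
pub fn contains_float(
    schema: &Schema,
    named: &HashMap<String, Schema>,
    visiting: &mut HashSet<String>,
) -> bool {
    match schema {
        Schema::String => false,
        Schema::Float => true,
        Schema::Array(items) => contains_float(items, named, visiting),
        Schema::Record { name, fields } => {
            if !visiting.insert(name.clone()) {
                // Already entered this record; its fields are handled there.
                return false;
            }
            fields.iter().any(|f| contains_float(f, named, visiting))
        }
        Schema::Ref(name) => match named.get(name) {
            Some(target) => contains_float(target, named, visiting),
            None => false, // unresolved reference: nothing to inspect
        },
    }
}

/// Builds a `Node` record whose `children` array refers back to `Node`.
pub fn node_schema(label: Schema) -> Schema {
    Schema::Record {
        name: "Node".to_string(),
        fields: vec![label, Schema::Array(Box::new(Schema::Ref("Node".to_string())))],
    }
}

/// `Eq` can only be derived when no float is reachable (`f32` is not `Eq`).
pub fn can_derive_eq(label: Schema) -> bool {
    let node = node_schema(label);
    let named = HashMap::from([("Node".to_string(), node.clone())]);
    !contains_float(&node, &named, &mut HashSet::new())
}
```

In this toy model, a check without the visiting set would loop on a string label but stop early on a float label, because any short-circuits before reaching the recursive children field. That would be consistent with the behaviour described above, though it is only a guess at the cause.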

lerouxrgd commented 5 months ago

@rockwotj I have added support for recursive types on the branch apache-avro-0.17, could you give it a try?

rockwotj commented 5 months ago

Works well for me! Thanks.