julianpeeters / sbt-avrohugger

sbt plugin for generating Scala sources for Apache Avro schemas and protocols.
Apache License 2.0

Support for top level type definitions #66

Open zhoekstra opened 5 years ago

zhoekstra commented 5 years ago

Our company has a set of Avro schemas that we're trying to use avrohugger on.

Among these schemas is a top-level schema for a fixed-size decimal:

{
    "type": "fixed",
    "size": 8,
    "namespace": "com.company.namespace",
    "name": "CustomDecimal",
    "logicalType": "decimal",
    "precision": 18,
    "scale": 6
}
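
For reference, Avro's decimal logical type stores the unscaled integer as big-endian two's complement, and for a fixed of 8 bytes the largest precision the spec allows is 18, which is what this schema uses. A rough sketch of that encoding in Scala, assuming nothing beyond java.math (the names `FixedDecimalSketch`, `encode`, and `decode` are made up for illustration and are not part of Avro or avrohugger):

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger}

// Illustrative only: mirrors the CustomDecimal schema (size 8, scale 6).
object FixedDecimalSketch {
  val Size = 8
  val Scale = 6

  // Encode a decimal as 8 bytes of big-endian two's complement.
  def encode(value: JBigDecimal): Array[Byte] = {
    // setScale without a rounding mode throws ArithmeticException
    // if the value has more than 6 fractional digits.
    val unscaled = value.setScale(Scale).unscaledValue()
    val raw = unscaled.toByteArray // minimal-length two's complement
    require(raw.length <= Size, s"value does not fit in $Size bytes")
    // Sign-extend on the left so the result is exactly 8 bytes.
    val pad: Byte = if (unscaled.signum() < 0) (-1).toByte else 0.toByte
    Array.fill[Byte](Size - raw.length)(pad) ++ raw
  }

  // Decode 8 bytes back into a decimal with the schema's scale.
  def decode(bytes: Array[Byte]): JBigDecimal =
    new JBigDecimal(new BigInteger(bytes), Scale)
}
```

With these definitions, 1.5 is stored as the unscaled value 1500000, left-padded with zero bytes to a length of 8.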

This type is then referenced by name throughout our other Avro schemas, like so:

{
    "namespace": "com.company.namespace.other",
    "type": "record",
    "name": "Operation",
    "fields": [
        {
            "name": "OperationType",
            "type": "com.company.namespace.other.OtherRecord"
        },
        {
            "name": "OperationAdjustment",
            "type": "com.company.namespace.CustomDecimal"
        },
        {
            "name": "OperationMode",
            "type": "com.company.namespace.other.ModeRecord"
        },
        {
            "name": "OperationValue",
            "type": "com.company.namespace.CustomDecimal"
        }
    ]
}

This allows us to define a single standard for decimal precision and scale across multiple records as we build our library of company-wide types.

Unfortunately, avrohugger does not yet support top-level types like this one. I'm not sure how best to represent it, but my guess right now would be to define it as a Scala type alias, i.e. if avrohugger sees the top-level Avro schema above, it would generate something like the following:

package com.company.namespace
type CustomDecimal = BigDecimal
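
One wrinkle with the snippet above: Scala 2 does not allow a type alias directly inside a package, so generated code would need to put it in a package object (or some other object), whereas Scala 3 accepts top-level aliases as written. A minimal sketch of the Scala 2 shape follows, using a plain object `CompanyNamespace` as a stand-in for `package object namespace` so that it fits in one file. None of this is output avrohugger produces today, and the `Operation` case class shows only the two decimal fields:

```scala
// Hypothetical shape of generated code; avrohugger does not emit this today.
object CompanyNamespace {
  // Plain alias: precision 18 and scale 6 from the schema are not enforced by the type.
  type CustomDecimal = BigDecimal
}

import CompanyNamespace.CustomDecimal

// Trimmed-down record: only the two CustomDecimal fields from the Operation schema.
final case class Operation(
  OperationAdjustment: CustomDecimal,
  OperationValue: CustomDecimal
)
```

Because the alias is just another name for `BigDecimal`, existing arithmetic and comparisons keep working on these fields unchanged.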

julianpeeters commented 5 years ago

Currently avrohugger generates records and enums as top-level definitions, but doesn't yet support fixed types or type aliases. Those sound useful to add eventually, and I'd be open to reviewing a PR, but my queue is pretty full these days, so the features are likely a ways off.