BurntSushi / toml

TOML parser for Golang with reflection.
MIT License

How to unmarshal tagged unions? #401

Closed by cal-pratt 10 months ago

cal-pratt commented 10 months ago

How can I write my structs and parsing logic to handle unmarshalling tagged unions? I have documents with deeply nested content, and many lists whose elements can each be one of several different types.

For example, say I have the following TOML:

[[items]]
  foo = "foo"
  [items.data]
    fizz = "fizz"
    buzz = "buzz"
[[items]]
  bar = "bar"
  [items.data]
    fizz = "fizz"
    buzz = "buzz"

Which in YAML looks like this:

items:
- foo: foo
  data:
    fizz: fizz
    buzz: buzz
- bar: bar
  data:
    fizz: fizz
    buzz: buzz

This is as close as I could get, but it still doesn't feel quite right.

type Data struct {
    Fizz string
    Buzz string
}
type Foo struct {
    Foo string
    Data Data
}
type Bar struct {
    Bar string
    Data Data
}
type Doc struct {
    Items []struct{
        *Foo
        *Bar
    }
}

This example breaks down if the variants have any overlapping fields. If a field is present in both, one of the sub-structs is picked seemingly at random to receive it, which makes the decoded data hard to grok.

I've also looked at using the unmarshal overrides on the structs like Unmarshaler.UnmarshalTOML, but this gives me only a map to work with. Right now, I still need to unmarshal the Data structs. I just want to tell the decoder what struct to start using and then continue on with its normal logic. If I use UnmarshalTOML near the root of the document, I have to manually assemble the entire structure, which defeats a lot of the purpose here.
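To make the cost of that approach concrete, here is a minimal sketch of what a hand-written UnmarshalTOML ends up doing. It assumes the method receives one [[items]] table as a map[string]any (the map in main is a hand-built stand-in, not decoder output), and it uses a hypothetical Item with named pointer fields and a hypothetical decodeData helper. The point is that Data has to be rebuilt field by field, which is exactly the work the decoder would otherwise do.

```go
package main

import (
	"errors"
	"fmt"
)

type Data struct{ Fizz, Buzz string }

type Foo struct {
	Foo  string
	Data Data
}

type Bar struct {
	Bar  string
	Data Data
}

// Item is a hypothetical union holder: exactly one field is set after decoding.
type Item struct {
	Foo *Foo
	Bar *Bar
}

// decodeData is a hypothetical helper that rebuilds Data by hand.
func decodeData(v any) (Data, error) {
	m, ok := v.(map[string]any)
	if !ok {
		return Data{}, errors.New("data: expected a table")
	}
	fizz, _ := m["fizz"].(string)
	buzz, _ := m["buzz"].(string)
	return Data{Fizz: fizz, Buzz: buzz}, nil
}

// UnmarshalTOML has the method shape of toml.Unmarshaler. It picks the
// variant from the discriminating key and assembles it manually.
func (i *Item) UnmarshalTOML(v any) error {
	m, ok := v.(map[string]any)
	if !ok {
		return fmt.Errorf("item: expected a table, got %T", v)
	}
	data, err := decodeData(m["data"])
	if err != nil {
		return err
	}
	if s, ok := m["foo"].(string); ok {
		i.Foo = &Foo{Foo: s, Data: data}
		return nil
	}
	if s, ok := m["bar"].(string); ok {
		i.Bar = &Bar{Bar: s, Data: data}
		return nil
	}
	return errors.New("item: neither foo nor bar is present")
}

func main() {
	// Assumption: stand-in for one decoded [[items]] table.
	raw := map[string]any{
		"foo":  "foo",
		"data": map[string]any{"fizz": "fizz", "buzz": "buzz"},
	}
	var it Item
	if err := it.UnmarshalTOML(raw); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(it.Foo != nil, it.Bar == nil, it.Foo.Data.Fizz)
}
```

Every nested type would need a helper like decodeData, so the manual code grows with the depth of the document.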

Any thoughts on this? Or am I missing something simple?

cal-pratt commented 10 months ago

I was thinking maybe something like this:

type TaggedUnion interface {
  // Return the sub field of the union to work on
  DetermineType(data any) any
}

type Item struct {
  *Foo
  *Bar
}

func (i *Item) DetermineType(data any) any {
  d := data.(map[string]interface{})
  if _, ok := d["foo"]; ok {
    i.Foo = &Foo{} // allocate the chosen variant so the decoder has a non-nil target
    return i.Foo
  }
  i.Bar = &Bar{}
  return i.Bar
}

After the parser finishes calling DetermineType, it would use that returned value to continue the rest of the unmarshalling.

arp242 commented 10 months ago

Your code seems alright. Or just add both a Foo and Bar string field under items because other than that they're identical – that seems simpler to me but I don't know the full context.

Other than that you can make it as complex as you want I suppose. I personally wouldn't do that and just "if foo {...} else {...}" seems a lot easier, but it's not for me to tell you how to write your code, especially since I don't really know the full details.

arp242 commented 10 months ago

I'd just use struct fields. It's simple, straightforward, and works. You can optionally add some helper methods if that's useful in your access patterns.

But like I said, it's your code, not really for me to decide.
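As a rough illustration of the struct-fields suggestion above, the sketch below puts both Foo and Bar on one flat Item and adds a hypothetical Kind helper. The Doc value in main is hand-built as a stand-in for the decoded document, and the sketch assumes an empty string means "not present", which would not hold if empty values are legitimate in the real data.

```go
package main

import "fmt"

type Data struct {
	Fizz string
	Buzz string
}

// Item carries the fields of both variants. At most one of Foo and Bar
// is expected to be non-empty after decoding.
type Item struct {
	Foo  string
	Bar  string
	Data Data
}

// Kind is a hypothetical helper that reports which variant the item holds.
func (i Item) Kind() string {
	switch {
	case i.Foo != "" && i.Bar != "":
		return "ambiguous"
	case i.Foo != "":
		return "foo"
	case i.Bar != "":
		return "bar"
	default:
		return "unknown"
	}
}

type Doc struct {
	Items []Item
}

func main() {
	// Assumption: stand-in for the result of decoding the example document.
	doc := Doc{Items: []Item{
		{Foo: "foo", Data: Data{Fizz: "fizz", Buzz: "buzz"}},
		{Bar: "bar", Data: Data{Fizz: "fizz", Buzz: "buzz"}},
	}}
	for _, it := range doc.Items {
		fmt.Println(it.Kind(), it.Data.Fizz)
	}
}
```

This keeps decoding entirely in the library's hands, at the price of a struct that can represent invalid combinations.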

cal-pratt commented 10 months ago

So I came up with a solution for anyone else who might need this behavior. I combined your parser with mapstructure so I can mutate the maps as they get unmarshaled. First, decode the document into a map[string]any, then hand it off to mapstructure to fill in the struct. A custom mapstructure decode hook allows mutating the values as they come in. I imagine this isn't quite as efficient, but it does the trick.

type PreprocessElem interface {
    PreprocessElem(any) (any, error)
}

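func LoadConfigFile(path string, target any) error {
    content, err := os.Open(path)
    if err != nil {
        return err
    }
    defer content.Close() // release the file handle once decoding is done
    // Decode into a generic table first so the hook can reshape it.
    table := make(map[string]any)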
    tomlDecoder := toml.NewDecoder(content)
    if _, err := tomlDecoder.Decode(&table); err != nil {
        return err
    }
    hook := func(from reflect.Value, to reflect.Value) (any, error) {
        if value, ok := to.Interface().(PreprocessElem); ok {
            return value.PreprocessElem(from.Interface())
        }
        return from.Interface(), nil
    }
    decoderConfig := &mapstructure.DecoderConfig{Result: target, DecodeHook: hook}
    decoder, err := mapstructure.NewDecoder(decoderConfig)
    if err != nil {
        return err
    }
    return decoder.Decode(table)
}

Example hook plugin:

type Item struct {
    *Foo
    *Bar
}

func (i Item) PreprocessElem(elem any) (any, error) {
    if data, ok := elem.(map[string]any); ok {
        if _, ok := data["foo"]; ok {
            return map[string]any{"foo": data}, nil
        }
        if _, ok := data["bar"]; ok {
            return map[string]any{"bar": data}, nil
        }
    }
    return elem, nil
}
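
To show what this hook does to a single element, the sketch below calls PreprocessElem directly on a hand-built map (a stand-in for one decoded [[items]] table) using only the standard library. The element is nested under a "foo" or "bar" key, which presumably lets mapstructure match it to the embedded *Foo or *Bar field by name. That matching behavior is an assumption about mapstructure's defaults and is not exercised here.

```go
package main

import "fmt"

type Data struct{ Fizz, Buzz string }

type Foo struct {
	Foo  string
	Data Data
}

type Bar struct {
	Bar  string
	Data Data
}

type Item struct {
	*Foo
	*Bar
}

// PreprocessElem wraps the incoming table under the key of the variant
// it belongs to, and leaves anything else untouched.
func (i Item) PreprocessElem(elem any) (any, error) {
	if data, ok := elem.(map[string]any); ok {
		if _, ok := data["foo"]; ok {
			return map[string]any{"foo": data}, nil
		}
		if _, ok := data["bar"]; ok {
			return map[string]any{"bar": data}, nil
		}
	}
	return elem, nil
}

func main() {
	// Assumption: stand-in for one decoded [[items]] table.
	elem := map[string]any{
		"bar":  "bar",
		"data": map[string]any{"fizz": "fizz", "buzz": "buzz"},
	}
	out, err := Item{}.PreprocessElem(elem)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	wrapped := out.(map[string]any)
	_, hasFoo := wrapped["foo"]
	inner, hasBar := wrapped["bar"].(map[string]any)
	// The original table now sits one level down, under "bar".
	fmt.Println(hasFoo, hasBar, inner["bar"])
}
```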

cal-pratt commented 10 months ago

Closing this, happy enough with this workaround.