ipld / go-ipld-prime

Golang interfaces for the IPLD Data Model, with core Codecs included, IPLD Schemas support, and some handy functional transforms tools.
MIT License

Representation of kinded Unions #313

Closed mark-ten9 closed 2 years ago

mark-ten9 commented 2 years ago

I am working with IPLD using go-ipld-prime and want to implement some custom ADLs. While experimenting with https://github.com/ipld/go-ipld-adl-hamt and generating code from my own custom schema, I noticed that unions with a kinded representation seem to be serialized inconsistently with the spec at https://ipld.io/docs/schemas/features/representation-strategies/#union-kinded-representation. The spec indicates that the union should be represented as one of its member types without any wrapping, but the code generated by this project wraps the union's content in a map with a single entry whose key is the member type name. Can someone explain this difference?

rvagg commented 2 years ago

Sorry for the very long delay @mark-ten9!

When you work with unions in go-ipld-prime, they are exposed as a Map through the data model because there isn't really a better option. Go doesn't have a proper union type, so we don't have a good way to represent them programmatically. We expose them as a map to stay consistent with the Node interface, which is intended for the data model layer, not the typed layer.

But they should encode as advertised: a kinded union should come out as a single bare value, like a proper union.
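To make "a single value" concrete, here is a minimal standard-library sketch of how a kinded union is told apart on the way back in: the kind of the encoded value alone selects the member. It does not use go-ipld-prime. The schema in the comment, the member names `Label` and `Meta`, and the function `memberForKind` are all hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical schema this sketch imitates (an assumption, not taken from
// the issue):
//
//   type Payload union {
//     | Label string
//     | Meta map
//   } representation kinded

// memberForKind reports which member of the hypothetical Payload union a raw
// JSON value selects, judging only by the kind of the value. No wrapping map
// or discriminator key is involved.
func memberForKind(raw []byte) (string, error) {
	var v interface{}
	if err := json.Unmarshal(raw, &v); err != nil {
		return "", err
	}
	switch v.(type) {
	case string:
		return "Label", nil
	case map[string]interface{}:
		return "Meta", nil
	default:
		return "", fmt.Errorf("no member of Payload has the kind of %s", raw)
	}
}

func main() {
	for _, raw := range []string{`"hello"`, `{"a":1}`, `true`} {
		name, err := memberForKind([]byte(raw))
		fmt.Println(raw, name, err)
	}
}
```

A string selects `Label`, a map selects `Meta`, and any other kind is rejected because the union has no member of that kind.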

If you're having problems with unions at the encoding stage and they're coming out as maps, then you might be hitting a footgun we've discussed possibly needing to plug at the encoding layer - it's possible to encode a TypedNode as a typed node, rather than as its representation as a data model Node. In practice this means we typically have this kind of boilerplate close to encoding:

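```go
// If the node is typed, swap in its representation form so the encoder
// sees the representation-level data rather than the type-level view.
if tn, ok := node.(schema.TypedNode); ok {
	node = tn.Representation()
}
// encode `node`
```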

If you skip the Representation() call on a TypedNode, you end up encoding the logical, type-level form of the node, which is usually not what you want at the encoder level.
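The difference between the two forms can be shown with a toy model that uses only the standard library. The `Node` and `TypedNode` interfaces below are simplified stand-ins, not the real go-ipld-prime definitions, and `kindedUnion`, `plain`, `encode`, and `encodeRepr` are hypothetical names introduced for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node is a toy stand-in for a data model node (NOT the library interface).
type Node interface {
	Value() interface{}
}

// TypedNode is a toy stand-in for a typed node that can also hand out its
// representation-level form.
type TypedNode interface {
	Node
	Representation() Node
}

// plain wraps a bare value with no type information attached.
type plain struct{ v interface{} }

func (p plain) Value() interface{} { return p.v }

// kindedUnion is a hypothetical union value holding exactly one member.
type kindedUnion struct {
	member string
	value  interface{}
}

// Value is the type-level view: a single-entry map keyed by the member name.
func (u kindedUnion) Value() interface{} {
	return map[string]interface{}{u.member: u.value}
}

// Representation is the kinded view: the bare member value, unwrapped.
func (u kindedUnion) Representation() Node { return plain{u.value} }

// encode serializes whatever view the node presents.
func encode(n Node) (string, error) {
	b, err := json.Marshal(n.Value())
	return string(b), err
}

// encodeRepr applies the guard first, so typed nodes are encoded in their
// representation form.
func encodeRepr(n Node) (string, error) {
	if tn, ok := n.(TypedNode); ok {
		n = tn.Representation()
	}
	return encode(n)
}

func main() {
	u := kindedUnion{member: "String", value: "hello"}
	wrapped, _ := encode(u)
	bare, _ := encodeRepr(u)
	fmt.Println(wrapped)
	fmt.Println(bare)
}
```

In this model, encoding the typed value directly yields `{"String":"hello"}`, the wrapped map described in the original question, while going through `Representation()` first yields the bare `"hello"`.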

I hope this helps, I'm going to close this for now but feel free to continue the discussion if I haven't satisfactorily answered your query; or perhaps there's a bug here I'm missing that needs to be addressed.

mark-ten9 commented 2 years ago

Thanks! This was very helpful.