ipld / specs

Content-addressed, authenticated, immutable data structures

schemas: rootType for advanced data layouts #208

Closed rvagg closed 4 years ago

rvagg commented 5 years ago

@mikeal has a good example of how this is useful: when you have an integrated set of components that include advanced layouts, they don't connect cleanly without rootType. It's still unclear exactly what you'd want when you're consuming someone else's ADL and don't want to care about the internal pieces of it (our HashMap or Vector for instance). And, do I need a schema and our tooling to produce and publish an ADL? Can't I make one entirely programmatically that you could pick off the shelf?

Anyway, optional gives us some wriggle room while we figure all of this stuff out.

mikeal commented 5 years ago

To be more specific, I’ve found it rather difficult to connect a schema to the data for an advanced layout in the implementation of the advanced layout itself. It’s actually not very practical to implement an advanced layout in a schema-gen system without having the data you’re working with also using schema-gen, so I really do want to attach a schema every time, and doing this dynamically in the implementation of the advanced layout turned out to be so cumbersome that I decided to just require that one be set up-front for js-ipld-schema-gen. This doesn’t necessarily mean that the schema language itself should make this mandatory, but it will be mandatory for my API generator. If we find that the same is true in Go we should revisit whether or not to make this mandatory in the schema language.

This PR would get me out of my current hack (naming the rootType and the advanced layout the same name).

mikeal commented 5 years ago

> It's still unclear exactly what you'd want when you're consuming someone else's ADL and don't want to care about the internal pieces of it (our HashMap or Vector for instance).

I wrote some code late last night that might be more illuminating.

I’ve broken the data layout entirely into its own schema and implementation file and plan to eventually publish them as a separate module. I do a full API generation with these using schema-gen, attach a few convenience methods, and export the type classes from the generator.

When I do the main unixfsv2 schema-gen I pass the type classes into the next schema-gen. This means that the main unixfsv2 schema depends on types that aren’t defined in that schema but are essentially required to be available as fully generated schema-gen type classes once you want to actually use the schema to generate an API.

The great thing about this approach is that it gets us completely out of needing any kind of import or reference syntax between schemas in the schema language itself. If a schema relies on an undeclared type, an exception will occur when trying to generate an API for it. API generation and codegen can simply consume their own output from the dependent schema, leaving all of the details of how advanced layout methods are defined and what the APIs look like for these types to the ecosystem and implementors of different schema-gen tools. It hardens the separation of IPLD Schemas from actual code and keeps the language a strict design language for content-addressed block data while enabling the signaling for the creation of more advanced types.
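A minimal sketch of that flow, to make the dependency direction concrete. This is not the js-ipld-schema-gen API: the `generate` function, the `BUILTINS` set, and the plain-object schema shape are all hypothetical and exist only for illustration.

```javascript
// Hypothetical sketch, not the js-ipld-schema-gen API.
// A "schema" here is a plain object mapping type names to descriptors.
const BUILTINS = new Set(['Int', 'String', 'Bytes', 'Bool', 'Link'])

function generate (schema, externalTypes = {}) {
  // Start from the classes produced by an earlier generation run, if any.
  const classes = { ...externalTypes }

  // First pass: create a class for every type declared in this schema.
  for (const [name, def] of Object.entries(schema)) {
    classes[name] = class {
      constructor (value) { this.value = value }
      static get typeName () { return name }
      static get definition () { return def }
    }
  }

  // Second pass: every referenced type must be a builtin, declared in this
  // schema, or handed in as an already generated class.
  for (const [name, def] of Object.entries(schema)) {
    for (const ref of Object.values(def.fields || {})) {
      if (!BUILTINS.has(ref) && !(ref in classes)) {
        throw new Error(`${name} references undeclared type "${ref}"`)
      }
    }
  }
  return classes
}

// The data layout lives in its own schema and is generated on its own.
const layoutTypes = generate({
  Data: { kind: 'struct', fields: { size: 'Int' } }
})

// The main schema uses Data without declaring or importing it.
const mainSchema = {
  File: { kind: 'struct', fields: { name: 'String', data: 'Data' } }
}

// Generation succeeds only because the Data class is passed in.
const mainTypes = generate(mainSchema, layoutTypes)
```

Calling `generate(mainSchema)` without `layoutTypes` throws, which is the only "import" check in this sketch. The schema text itself never has to say where `Data` comes from.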

rvagg commented 5 years ago

(all your links are wrong now you've rearranged your repo btw, have to navigate to src instead of lib)

One minor benefit of a rootType, in the case where an advanced layout is in the middle of your schema and not the primary thing you're exporting, is that you could use kinded unions, because you can follow to the rootType to figure out the representation kind and make sure it works. That's probably a bit trickier in the case where you're importing an external advanced type, unless there's a way to query the classes you're handing over to figure out the representation kind. Or maybe each class could link to its own schema and type within that schema so it could figure it out programmatically. This is not a primary concern right now but might become interesting if we want to do fully-inline data structures and have kinded unions involved in that process.
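As a rough illustration of "following to the rootType", here is a sketch of how a tool might resolve a representation kind and check a kinded union against it. The descriptor format, the `rootType` field placement, and the function names `representationKind` and `checkKindedUnion` are assumptions made for this example, not the schema JSON form or any existing tool.

```javascript
// Hypothetical, simplified type descriptors (not the real schema JSON form).
const types = {
  HashMapRoot: { kind: 'struct' }, // structs are maps by default
  HashMap: { kind: 'advanced', rootType: 'HashMapRoot' },
  Name: { kind: 'string' },
  Entry: {
    kind: 'union',
    representation: 'kinded',
    members: { string: 'Name', map: 'HashMap' }
  }
}

// Resolve the kind a type has at the data model level.
function representationKind (types, name, seen = new Set()) {
  const def = types[name]
  if (!def) throw new Error(`unknown type "${name}"`)
  if (seen.has(name)) throw new Error(`rootType cycle at "${name}"`)
  seen.add(name)

  if (def.kind === 'advanced') {
    // Without a rootType there is nothing to follow.
    if (!def.rootType) {
      throw new Error(`"${name}" has no rootType, representation kind unknown`)
    }
    return representationKind(types, def.rootType, seen)
  }
  // Assumption: a struct uses its default map representation unless stated.
  if (def.kind === 'struct') return def.representation || 'map'
  return def.kind
}

// Each kinded union member must resolve to the kind it is listed under.
function checkKindedUnion (types, name) {
  for (const [kind, member] of Object.entries(types[name].members)) {
    const actual = representationKind(types, member)
    if (actual !== kind) {
      throw new Error(`${member} is listed as ${kind} but resolves to ${actual}`)
    }
  }
  return true
}

const hashMapKind = representationKind(types, 'HashMap')
const entryIsValid = checkKindedUnion(types, 'Entry')
```

For an externally supplied advanced type, the same check would need the handed-over class to expose its rootType or representation kind in some way, which is the open question raised above.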

mikeal commented 5 years ago

> all your links are wrong now you've rearranged your repo btw, have to navigate to src instead of lib

ya, i found out that aegir’s linter was actually ignoring lib :(

unions

One thing that we considered for a minute and then dropped recently was: maybe all Advanced Layouts should be unions.

I think we sort of dropped it because it was so different from what we’ve been doing and because it was a little heavy handed.

However, looking at the Data schema, maybe they should be. It would change the layout a little and where I would put the advanced logic, but it would also forcibly add even more future proofing which may actually be a good thing.

@warpfork I’m curious what you think. The idea was rather simple: all Advanced Layouts are unions and therefore have a union-style syntax.

advanced Foo {
  | MyType "myType"
} representation keyed

This first came up when we discussed how to best cover the schema migration problem in the context of advanced layouts: “what happens when you want to apply a different type?” To which my response was “if it wasn’t set up as a union you’re kind of screwed anyway.”
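To show why a keyed union helps with migration, here is a small sketch of reading keyed-union data, where the encoded node is a map with exactly one key that names the member. The member tables and the `decodeKeyedUnion` helper are hypothetical and not taken from any existing implementation.

```javascript
// Hypothetical member table for the Foo layout above: key -> reader.
const fooMembers = {
  myType: (value) => ({ type: 'MyType', value })
}

// A keyed union is encoded as a map with a single key naming the member.
function decodeKeyedUnion (members, node) {
  const keys = Object.keys(node)
  if (keys.length !== 1) {
    throw new Error('a keyed union must have exactly one key')
  }
  const key = keys[0]
  if (!Object.prototype.hasOwnProperty.call(members, key)) {
    throw new Error(`unknown union member "${key}"`)
  }
  return members[key](node[key])
}

const decoded = decodeKeyedUnion(fooMembers, { myType: { entries: [] } })

// Migration: a later version adds a member. Old data still decodes because
// the key already says which type applies.
const fooMembersV2 = {
  ...fooMembers,
  myTypeV2: (value) => ({ type: 'MyTypeV2', value })
}
const decodedOld = decodeKeyedUnion(fooMembersV2, { myType: { entries: [] } })
const decodedNew = decodeKeyedUnion(fooMembersV2, { myTypeV2: { buckets: [] } })
```

If the layout had not been a union from the start, the old data would carry no key to dispatch on, which is the situation the comment describes.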

mikeal commented 4 years ago

@rvagg what’s the status of this PR?