ipld / specs

Content-addressed, authenticated, immutable data structures

ADLs as a core part of IPLD? #266

Open rvagg opened 4 years ago

rvagg commented 4 years ago

Continuing a discussion from elsewhere where we're trying to clarify this graphic:

[Image: IPLD Diagram]

Without quoting semi-private discussion, there is disagreement about the place of ADLs here. @mikeal wants to remove them from a graphic that is trying to define "what is IPLD".

@warpfork has been suggesting that their existence is important for the proper design of extensible IPLD implementations and omitting them may lead to suboptimal implementations that have difficulty being extended to all that we're trying to build.

I'm very sympathetic to the latter view and see its importance in go-ipld-prime and would love to see the same design make its way through the new Rust implementation. But we really don't have anything that looks like a proper ADL yet and we keep on talking as if they are a concrete thing, which they aren't, and maybe that's not been helpful to people who aren't clued in to the way we're all conceptualising it. Even putting it in this graphic suggests that we have something to show, but we just don't.

I'm fine either way wrt this graphic. If we leave it out then we could re-insert it later when we get closer to the ultimate dream. Mostly I just want to isolate this particular discussion here.

rvagg commented 4 years ago

My vote: remove ADL entirely for now and leave the option of adding it back in once we have more concrete evidence that it's a core component. I like this option because it more accurately represents the state of IPLD and prevents us from having to explain something that's just not there, but it also gives us something to shoot for—it can be added in at a future date if/once ADLs become an integral part of our stack (FTR I still reckon they have to be because selectors and paths and other navigational tools need to be ADL aware for the stack to ultimately be useful for users).

warpfork commented 4 years ago

(Re-pasting from earlier discussion for reference availability)

The purpose of this image is to make it crystal clear that someone implementing an IPLD library in a new language will want to conceive of a node interface[‡] and expect that to handle all three of:

  1. {Data Model (plain)},
  2. {Schemas},
  3. and {Advanced Data Layouts}.

[‡] - if not literally an interface if writing in a language with strong types, then still certainly a loose "contract" of expected methods that can cover the basics like "lookup key in a map". Potæto potato.

The reason we started drafting this image, and the reason we're having this discussion, is that I want to communicate this key fact about interfaces and the need for some unified behaviors -- to, for example, some of the people making efforts in the Rust ecosystem (though the same info will need communicating to other groups over time; this was just the most pressing one at the time). If the need for unified behaviors over these three forms is not communicated, people developing libraries will have a much, much harder time figuring out the key details of their abstractions.

warpfork commented 4 years ago

I still don't really understand the arguments in favor of removing ADLs from the picture.

I estimate the odds that the concept of ADL will appear in the future and not be part of the "you'll want some sort of unified contract or interface for nodes" to be approximately 0%. That's not to say we can't also have things that have more function or more interfaces in various forms of library API; that's fine. But when I say "ADL", literally the thing that I mean is "I can treat this as a node and if its kind is map then I can use the LookupMapValueByStringKey semantic (or other map semantics) on it as would be normal for any other plain Data Model node". Where... else... would we draw this in relationship to Data Model, other than "extremely darn close"?

I don't think the key points are better made by leaving Schemas and Plain Data Model alone as a pair, either. That would make it much more likely for a prospective implementer of a new IPLD library in a new language to look at these docs and go "huh, well, okay, I'm not doing schemas in my first version ((fair!)), so then, I guess I'll be fine using a sum type for my Node type..." --> and that then leads directly to sadness later on when they try to engage with either ADLs or Schemas. The entire "different implementors of node" document in the Library Design Recommendations is about this, already. The pictures should match!
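To make the "unified contract" idea concrete, here is a minimal TypeScript sketch. Every name in it (`IpldNode`, `PlainMapNode`, `ShardedMapNode`, `lookupByString`, `readString`) is hypothetical and is not the API of go-ipld-prime or any other library. The sharded map is only a toy stand-in for an ADL: its shards live in memory rather than in separate blocks.

```typescript
// Hypothetical names throughout; a sketch, not any existing IPLD library's API.
type Kind = "map" | "string";

// The loose "contract" that every node honours, whatever produces it.
interface IpldNode {
  kind(): Kind;
  lookupByString(key: string): IpldNode | undefined;
  asString(): string | undefined;
}

class StringNode implements IpldNode {
  constructor(private readonly value: string) {}
  kind(): Kind { return "string"; }
  lookupByString(_key: string): IpldNode | undefined { return undefined; }
  asString(): string | undefined { return this.value; }
}

// Plain Data Model map: a single in-memory table.
class PlainMapNode implements IpldNode {
  constructor(private readonly entries: Map<string, IpldNode>) {}
  kind(): Kind { return "map"; }
  lookupByString(key: string): IpldNode | undefined { return this.entries.get(key); }
  asString(): string | undefined { return undefined; }
}

// Pick a shard from the first character of the key (empty keys go to shard 0).
function shardIndex(key: string, shardCount: number): number {
  return (key.length > 0 ? key.charCodeAt(0) : 0) % shardCount;
}

// Toy ADL: reports kind "map" and answers map lookups,
// but internally spreads its entries over several shards.
class ShardedMapNode implements IpldNode {
  constructor(private readonly shards: PlainMapNode[]) {}

  static fromEntries(entries: [string, string][], shardCount: number): ShardedMapNode {
    const tables = Array.from({ length: shardCount }, () => new Map<string, IpldNode>());
    for (const [k, v] of entries) {
      tables[shardIndex(k, shardCount)]?.set(k, new StringNode(v));
    }
    return new ShardedMapNode(tables.map((t) => new PlainMapNode(t)));
  }

  kind(): Kind { return "map"; }
  lookupByString(key: string): IpldNode | undefined {
    return this.shards[shardIndex(key, this.shards.length)]?.lookupByString(key);
  }
  asString(): string | undefined { return undefined; }
}

// Consumer code written once against the contract; it cannot tell the two apart.
function readString(map: IpldNode, key: string): string | undefined {
  return map.lookupByString(key)?.asString();
}

const plain = new PlainMapNode(new Map<string, IpldNode>([["name", new StringNode("ipld")]]));
const adl = ShardedMapNode.fromEntries([["name", "ipld"], ["zone", "specs"]], 4);
console.log(readString(plain, "name"), readString(adl, "name"));
```

Had `IpldNode` been a closed sum type over the plain kinds only, `ShardedMapNode` would have had no way in without touching every consumer such as `readString`.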

rvagg commented 4 years ago

I agree with this sentiment when viewing this from an implementer perspective (caveat at the bottom of this comment): implementations should build abstractions such that interactions with data can occur at multiple layers above the data model as if they were interacting with the data model itself. This includes viewing data through the schema "lens" and also through ADLs as a programmatic layer that hides some more complex behaviours (notably performing block loads transparently).

My interpretation of @mikeal's thinking on this is that the JS view of the world is hitting hard - where we can't just hide block loading behind a consistent interface because it involves async activity that isn't necessary when you're interacting with plain data model (and schema) data that you have locally. A newnode = node.TraverseIndex(100) operation in JavaScript (conceptually) would be fine when you have the data in memory and can do a synchronous array look-up (would even be fine for an ADL where you have all the data locally, which would be an odd case). To make room for ADLs being transparently inserted in between, we'd have to turn every such operation into a newnode = await node.TraverseIndex(100). Meaning that a simple foo = bar[100] involves a Promise resolution just because maybe there's an ADL in there that will do an asynchronous load! Which gets very inefficient very fast.
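A rough TypeScript sketch of that cost, under stated assumptions: `SyncList`, `AsyncList`, `PlainList`, `ChunkedList` and `AsyncPlainList` are hypothetical names, and the async `loadChunk` callback merely stands in for a block load. It is meant to illustrate the shape of the problem, not how any JS IPLD library is actually written.

```typescript
// Hypothetical names throughout; a sketch, not any existing IPLD library's API.
interface SyncList {
  traverseIndex(i: number): number | undefined;
}

interface AsyncList {
  traverseIndex(i: number): Promise<number | undefined>;
}

// Plain Data Model list held locally: a synchronous array look-up is enough.
class PlainList implements SyncList {
  constructor(private readonly items: number[]) {}
  traverseIndex(i: number): number | undefined {
    return this.items[i];
  }
}

// ADL-like list whose chunks sit behind an async loader (standing in for block loads).
class ChunkedList implements AsyncList {
  constructor(
    private readonly chunkSize: number,
    private readonly loadChunk: (n: number) => Promise<number[]>,
  ) {}
  async traverseIndex(i: number): Promise<number | undefined> {
    const chunk = await this.loadChunk(Math.floor(i / this.chunkSize));
    return chunk[i % this.chunkSize];
  }
}

// To let callers treat both kinds alike, the plain list has to be lifted
// into the async interface, even though nothing asynchronous happens.
class AsyncPlainList implements AsyncList {
  constructor(private readonly inner: PlainList) {}
  async traverseIndex(i: number): Promise<number | undefined> {
    return this.inner.traverseIndex(i);
  }
}

// Generic consumer: one Promise resolution per element, whatever the backing store.
async function sum(list: AsyncList, n: number): Promise<number> {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += (await list.traverseIndex(i)) ?? 0;
  }
  return total;
}

// sum(new AsyncPlainList(new PlainList([1, 2, 3])), 3) resolves to 6,
// but only after three awaits on data that was already in memory.
```

The synchronous `PlainList` answers immediately, while any code written against `AsyncList` pays for a Promise per lookup just in case an ADL is behind it.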

This is entirely hidden in synchronous languages, Go most notably here. I don't know about Rust though, the costs might be better managed than in JS since this is being built into the language from much earlier on. @vmx? I don't think this is a critical point, though, because your argument still stands that implementations should at least be aware of this possibility when designing abstractions so these pieces could be inserted in between. Without making that room, trying to add ADLs later might be quite painful (as they are in JS already!).

Where we might quibble on this graphic is this:

> The purpose of this image is to make it crystal clear that someone implementing an IPLD library

I don't know if that is the purpose. It's useful for that, but I think mostly useful for our website and specs repo README to explain IPLD to users. Maybe for implementer notes we can go further than what's contained in this image. Designing a graphic just for consumption by implementers seems like excessive effort.

warpfork commented 4 years ago

> I don't know if that is the purpose

Okay, fair. I should confine my statement to more like "It's what I had in mind in the earliest discussion".

To maybe recontextualize that particular part of my statements a bit: I'm mostly worried about scope creep for this image.

There have been a couple of questions if we should add X or Y or Z to this image, and so far, my feelings on that have mostly been "no": this image does enough already. In particular, I think if we made an image that listed every thing and subsystem and nearby library that's a thing we've worked on as part of IPLD, that image is one that gets cluttered fast... and while I think that image might be cool, I don't want to try to grow this one into it, because those are different things; and our prior attempts at such an all inclusive image for the specs/docs repos (whether actually an image or just an asciigram) have been... well, you've been here to see a couple of the jousts at this. They get messy. I'd love to trade down on inclusiveness and up on focus and shipping for this one.


[the entire rest of that comment]

:+1:


[the bits about interfaces over i/o]

Yeah, fair. I do regularly forget how incredibly empowering it is to work within a language that successfully shields me from the "What color is your function?" problem.

vmx commented 4 years ago

With regard to Rust: I know I've been warned by @warpfork often enough that I should put thought into this early on. I haven't put much thought into it yet. The reason is that it's still early and there are so many basic things (like not depending on a fixed codec code table) that I'd like to deal with first (which is already a big enough challenge). So I can't comment on how the Rust implementation will relate to it, but I'm on @warpfork's side that ADLs should be there and we should try to make them as Data Model-like as languages permit.