WebAssembly / component-model

Repository for design and specification of the Component Model

Nested interfaces #372

Open macovedj opened 3 months ago

macovedj commented 3 months ago

Here is an outline for the various ways in which one could describe nested interfaces in wit.

For additional context re: motivation, see this issue, though generally what this enables is the ability to target nested instances which are expressible in wat and binary formats, and not in wit.

An initial implementation that still needs some work can be found here, though at the moment, it would only add support for the case of nesting interfaces from foreign packages.

Perhaps this PR can also be a place to hash out a bit of what the implications would be from a guest lang bindgen perspective as well.

rossberg commented 3 months ago

It’s great seeing this expressivity gap in WIT addressed. However, I don’t think the proposed extension is the right solution for this. It is very special-cased and fails to address closely related or more general use cases. For example, it's quite natural to want to express an interface of the form

    interface {
       a1 : A
       a2 : A 
    }

where both sub-elements have the same interface A, but without also defining a standalone A on the outer level. Similarly, a common case is a component like

    world {
        import i : A;
        export e : A;
    }

where the same signature occurs for a generic import and an export. Or perhaps even only two imports:

    world {
        import a1 : A;
        import a2 : A;
    }

And endless other variations.

The actual root of the problem is that WIT — unlike the raw component system — conflates two entirely different forms of declaration, namely, the named definition of an interface type (an actual interface per se) and declaring the presence of an instance of an interface (the implementation of an interface).

For perspective, compare this to a regular programming language, where we usually distinguish

    type T = struct { a : A; b : B }

from

    export T : struct { a : A; b : B }

and these are entirely different things, declaring very different categories of T's. I doubt anybody would consider merging both into a single form of declaration a helpful thing to do.
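As a rough illustration of that distinction in Rust terms (a sketch only; the names `A`, `Pair`, and `PAIR` are made up for this example and do not come from WIT or any bindings generator): naming a type declares no value, and one named type can back several instances.

```rust
/// Names a type only; no value of type `A` exists because of this definition.
pub struct A {
    pub label: &'static str,
}

/// A second named type whose two fields share the same type `A`,
/// mirroring the `a1 : A` / `a2 : A` shape discussed above.
pub struct Pair {
    pub a1: A,
    pub a2: A,
}

/// Declares an actual instance. This is a different category of
/// declaration from the two type definitions above.
pub static PAIR: Pair = Pair {
    a1: A { label: "first" },
    a2: A { label: "second" },
};
```

Here `A` is used twice without any standalone value of type `A` being declared at the outer level.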

An addition like the one proposed in this PR is not fixing this underlying category error, but arguably makes the conflation worse, very likely requiring further ad-hoc work-around features in the future.

The more adequate and scalable solution would be to properly separate these notions in WIT as well. That is, introduce a new form of declaration for naming an interface type, that then can be used in all places where an interface description would occur, but does not by itself declare an instance.

The example from the PR would become something along the lines of

    interface type Foo {
        ...
    }

    interface foo : Foo

    interface top {
        foo : Foo;
        bar : foreign:pkg/Bar;
        baz : interface {
            ...
        }
    }

but importantly, you don’t have to declare the existence of a global foo if all you actually want is to name the type Foo, because it's used in multiple places.

macovedj commented 3 months ago

Hmm, can we talk about this in terms of how it would be encoded? Given that today the interface syntax is encoded as follows:

interface foo {
  ...
}

(component
  (type (;0;) (instance
    ...
  ))
  (export "ns:id/foo" (type 0))
)

We can see that interface is currently encoded as a type that is exported.

Note in the proposal, the nest keyword is exporting an instance rather than a type, which is a difference between nested interfaces and top level interfaces.

I may be misinterpreting, but it feels like the interface keyword currently behaves how I would intuitively expect the proposed interface type syntax to behave, and it feels like nest is implicitly operating in a way that feels similar to foo: Foo.

Given the current usage of interface, if we were to address the larger concerns here, would it maybe make more sense to go the opposite direction?

Thinking of interface as operating currently how interface type is proposed to behave, instead we could do something like

interface foo-type {
  ...
} 

foo: instance foo-type;

interface top {
  foo-inst: foo,
  top-foo-type: foo-type
}

Or using @rossberg's initial suggestion, not even introduce the instance keyword and just have

foo: foo-type

but allow it to act on an interface rather than an interface type.

Then top has the liberty of exporting either or both top-foo-type as a type like what we currently see in top level interfaces, or a foo-inst as an instance, as is originally proposed in this PR. If I'm reading correctly, this feels like it accomplishes disambiguating without changing the current semantics of interface.

lukewagner commented 3 months ago

@rossberg In your example WIT, you're giving all 3 nested interfaces a plainname, i.e., they turn into the following instance type:

(type $top (instance
  (export "foo" (instance ...))
  (export "bar" (instance ...))
  (export "baz" (instance ...))
))

That's fine, and we can quibble about the right syntax for addressing all the use cases you're getting at, but that is all quite independent from the use case we are trying to address in this PR which is that we want an export of an interfacename (that has been separately defined and named). e.g., we want:

(type $top (instance
  (export "wasi:http/incoming-handler" ...)
))

which this PR proposes to write as:

interface top {
  nest wasi:http/incoming-handler;
}

Notice that we are quite intentionally not assigning a locally made-up plainname; we want to export the nested instance by specifying only its interfacename. I don't think any of the syntax you're proposing addresses this use case?

macovedj commented 3 months ago

One of the other concerns raised in the PR that implements nesting foreign packages was whether we've considered what this should look like from the perspective of guest bindings. Spitballing with Rust: currently, if we have the following wit

package foo:bar@1.0.0;

interface things {
  record my-record {
    foo: string
  }
}
world my-world {
  export things;
}

We end up with the following

#[allow(dead_code)]
pub mod exports {
  #[allow(dead_code)]
  pub mod foo {
    #[allow(dead_code)]
    pub mod bar {
      #[allow(dead_code, clippy::all)]
      pub mod things {
        ...
      }
    }
  }
}

Does it make sense to just continue to nest mods? So using the examples above:

nest foo;
nest foreign:pkg/bar;
nest baz;
...

then things above could look as follows:

pub mod things {
  ...
  pub mod foo {
    ...
  }
  pub mod foreign {
    pub mod pkg {
      pub mod bar {
        ...
      }
    }  
  }
  pub mod baz {
    ...
  }
}

I'm guessing I may be missing something, though I'd guess that other languages namespace things based on packages/interfaces similarly...

macovedj commented 2 months ago

I guess the other thoughts worth proposing for bindgen are more detailed mechanics of generating interfaces/traits.

I'd guess that if interface A has a generated trait definition GuestA and also nests interface B, which has a generated trait definition GuestB, then GuestA would probably have GuestB as an associated type, where the associated type is namespaced as outlined above.
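A minimal sketch of what that guess could look like, assuming a hand-written layout rather than real generated output. The module names `a` and `b`, the functions `name` and `greet`, and the helper `nested_greeting` are all hypothetical:

```rust
// Hypothetical bindings for an interface `a` that nests an interface `b`.
// This is an illustration of the idea, not actual bindgen output.
pub mod a {
    pub mod b {
        /// Trait for the nested interface `b`.
        pub trait GuestB {
            fn greet() -> String;
        }
    }

    /// Trait for interface `a`; the nested interface appears as an
    /// associated type bounded by the trait namespaced under `a`.
    pub trait GuestA {
        type B: b::GuestB;
        fn name() -> String;
    }
}

pub struct MyB;
impl a::b::GuestB for MyB {
    fn greet() -> String {
        "hello from b".to_string()
    }
}

pub struct MyA;
impl a::GuestA for MyA {
    type B = MyB;
    fn name() -> String {
        "a".to_string()
    }
}

/// Reaches the nested implementation through the associated type.
pub fn nested_greeting<T: a::GuestA>() -> String {
    format!("{}: {}", T::name(), <T::B as a::b::GuestB>::greet())
}
```

With this shape, an implementor of `GuestA` picks the type that implements the nested interface, and callers reach it via `T::B`.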

lukewagner commented 2 months ago

Good points! Because nested interfaces have a totally distinct identity (of contained types and functions) from any other nesting or top-level import/export of the same interfacename, I think you're right in your above Rust code to nest foreign:pkg/bar under things so that it's distinct from, e.g., an import foreign:pkg/bar in the same world. To your second question about traits, though: I think they would also need to be kept distinct so that there is a separate trait (which can have separate concrete resource types) for each nested occurrence of an interfacename.
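To make the "separate trait per occurrence" point concrete, here is a sketch under assumed names (the `Guest` trait, its `Thing` associated type standing in for a resource, and the `make` function are hypothetical, and the module layout is hand-written rather than generated). The same interfacename appears once at the top level and once nested under `things`, and each occurrence gets its own trait with its own concrete type:

```rust
// Top-level occurrence of `foreign:pkg/bar` (hypothetical layout).
pub mod foreign {
    pub mod pkg {
        pub mod bar {
            pub trait Guest {
                /// Stand-in for a resource type of this occurrence.
                type Thing;
                fn make() -> Self::Thing;
            }
        }
    }
}

// Occurrence of the same interfacename nested under `things`.
pub mod things {
    pub mod foreign {
        pub mod pkg {
            pub mod bar {
                pub trait Guest {
                    /// Distinct from the top-level `Thing`.
                    type Thing;
                    fn make() -> Self::Thing;
                }
            }
        }
    }
}

pub struct TopThing(pub u32);
pub struct NestedThing(pub &'static str);

pub struct Component;

impl foreign::pkg::bar::Guest for Component {
    type Thing = TopThing;
    fn make() -> TopThing {
        TopThing(1)
    }
}

impl things::foreign::pkg::bar::Guest for Component {
    type Thing = NestedThing;
    fn make() -> NestedThing {
        NestedThing("nested")
    }
}
```

Because the two traits live at different module paths, one component type can implement both without the concrete types being forced to coincide.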

Maybe not interesting, but one corner case worth considering is what if we have:

interface things {
  foreign: interface { ... };
  nest foreign:pkg/bar;
}

This is allowed b/c the first nested interface is given the plainname foreign and the second nested interface is given the interfacename foreign:pkg/bar (which are distinct and thus allowed) but with the above scheme, both will try to stick a pub mod foreign into the things module. I assume the right solution here is to handle this rare collision with some avoidance scheme (maybe add a suffix character to the foreign namespace). I believe this sort of collision also arises at world-level (replacing nest with import), so it'd be good to know what we do there and probably copy that.
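One possible avoidance scheme, sketched as a standalone helper. This is an assumption for illustration only: the function names `ident` and `assign_root_modules` are hypothetical, the `_` suffix is an arbitrary choice, and nothing here reflects what any bindings generator does today. Plainnames keep their identifier, and an interfacename's namespace gets a suffix only when it would clash with a plainname:

```rust
use std::collections::HashSet;

/// Turns a kebab-case WIT name into a Rust-style module identifier.
fn ident(kebab: &str) -> String {
    kebab.replace('-', "_")
}

/// For each nested name, returns the first module segment it would occupy.
/// A plainname such as `foreign` maps to itself. An interfacename such as
/// `foreign:pkg/bar` maps to its namespace, with `_` appended while that
/// namespace collides with a plainname in the same interface.
pub fn assign_root_modules(names: &[&str]) -> Vec<String> {
    // Collect the identifiers claimed by plainnames first.
    let mut plain = HashSet::new();
    for &name in names {
        if !name.contains(':') {
            plain.insert(ident(name));
        }
    }

    let mut out = Vec::with_capacity(names.len());
    for &name in names {
        let module = match name.split_once(':') {
            Some((namespace, _rest)) => {
                let mut m = ident(namespace);
                while plain.contains(&m) {
                    m.push('_');
                }
                m
            }
            None => ident(name),
        };
        out.push(module);
    }
    out
}
```

Two interfacenames from the same namespace still share one module, since only plainname clashes trigger the suffix.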