WebAssembly / component-model

Repository for design and specification of the Component Model

Evolving packages over time #262

Open rylev opened 9 months ago

rylev commented 9 months ago

I have a wit package that I am evolving over time with some interfaces changing and others staying the same. I'm running into an issue where a world definition for the new package version is a mishmash of interfaces, types, and functions from multiple versions of the package.

Let's imagine two versions of the same wit package (I've elided interface definitions for brevity; hopefully it's still easy to understand):

Version 1

package acme:widget@1.0.0

world host {
  import foo
  import bar
  export baz
}

Let's imagine a version 2.0.0 where only interface foo has changed. The other two interfaces are structurally identical between versions. I have a few choices for how to model this.

Version 2 Alternative 1

I can directly reference the old version so that the id of the interfaces does not change at all:

package acme:widget@2.0.0

world host {
  import foo
  import acme:widget/bar@1.0.0
  export acme:widget/baz@1.0.0
}

This has the advantage of directly modeling reality. The bar and baz interfaces are not only structurally identical, they are literally the same interfaces in both packages. Hosts and guests only need to change their understanding of the foo interface, everything else remains the same.

This has a disadvantage, however. The user is now targeting interfaces across multiple package versions even though logically the world is associated with package version 2.0.0. This might confuse users who are unsure why they need to target different versions of interfaces.

Version 2 Alternative 2

Alternatively, I can copy all the interface definitions over to 2.0.0 and require users of the world to update to target interfaces with new names:

package acme:widget@2.0.0

world host {
  import foo
  import bar
  export baz
}

This has the advantage of being a tidy world.

However, hosts and guests will have to update since the names of the interfaces have changed, even if they are logically the same. For guests this should hopefully require little change (though wit-bindgen for Rust does encode the version in the Rust modules it generates, so it's not completely without change).

This gets really interesting with hosts that want to support both versions of acme:widget. They will need host implementations for all interfaces, even those that are structurally identical.

Question

So what is the right path forward here? Which of the alternatives do we want to encourage folks to choose? Is there something the component model can do to ease the path forward?

One possible solution might be to allow aliasing interfaces. If tooling were made aware that acme:widget/bar@1.0.0 and acme:widget/bar@2.0.0 are supposed to be the exact same interface, we could generate code on behalf of the user so that host implementations don't need two different implementations.

However, perhaps the component model doesn't need to change at all here, and we just need to update tooling so that it can be aware of when interfaces are structurally identical so that host and guest implementations can be shared across them.

lukewagner commented 9 months ago

(sorry for the slow reply; back from wasm CG meeting). That's a really good question; thanks for clearly laying it out!

So before getting to the question of what happens on major/breaking version changes, we can think through what we expect to happen with minor/non-breaking version changes. Minor version changes will be relatively more frequent so I would expect everyone wants to write what's in your "Alternative 2". Furthermore, to avoid requiring the embedder to have to manually enumerate all supported minor (and patch) versions (of which there could be hundreds), I think we'd expect the host to have some amount of semver logic baked in, so that, e.g., when an implementation of foo:bar/baz@1.1.4 is registered with the runtime, the runtime will by default provide this implementation to components that import foo:bar/baz@1.1.3. Thus, I think in any case hosts will be doing some amount of interpretation of version strings.
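To make the "semver logic baked into the host" idea concrete, here is a minimal sketch of the matching rule described above. The function names `parse` and `satisfies` are hypothetical and not part of any runtime's API; pre-release and build tags are not handled, and the treatment of 0.x versions is an assumption based on common semver convention.

```python
def parse(version: str) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (pre-release tags are not handled)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies(provided: str, requested: str) -> bool:
    """Return True if an implementation at `provided` may serve an import of `requested`."""
    p, r = parse(provided), parse(requested)
    if p[0] != r[0]:
        return False  # a major version change is breaking, never matched automatically
    if p[0] == 0:
        # Assumption: for 0.x versions, a minor bump is treated as breaking.
        return p[1] == r[1] and p[2] >= r[2]
    # Same major version: the provided (minor, patch) must be at least as new.
    return p[1:] >= r[1:]
```

Under this rule, `satisfies("1.1.4", "1.1.3")` holds, so an implementation registered at foo:bar/baz@1.1.4 would be offered to a component importing @1.1.3, while `satisfies("2.0.0", "1.0.0")` does not.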

With major version changes, of course the host wouldn't automatically supply, e.g., a @2.0.0 implementation for a @1.0.0 import, but you could imagine that, as part of the host interface for registering an implementation of an interface, there is an optional list of older major versions that this implementation also supports (requiring that the types be compatible). So that could be a thing the host does without any additional Component Model support.

But it is a good idea/question of whether we can provide more first-class support for this scenario. Let's say WIT let you write (probably awful concrete syntax, but bear with me):
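A host-side registration interface with such an opt-in list might look roughly like the sketch below. The `Registry` class, its methods, and the `also_supports` parameter are all hypothetical names chosen for illustration; no existing runtime is claimed to expose this API. The sketch also omits the type-compatibility check mentioned above, which a real host would need to perform before accepting an older major version.

```python
def parse(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (pre-release tags are not handled)."""
    return tuple(int(part) for part in version.split("."))


class Registry:
    """Hypothetical host-side table of interface implementations."""

    def __init__(self):
        # (interface name, major version) -> (registered version tuple, implementation)
        self._impls = {}

    def register(self, name, version, impl, also_supports=()):
        """Register `impl` for name@version and for each explicitly listed older version.

        Assumption: the caller has already verified that the types are compatible.
        """
        for v in (version, *also_supports):
            parsed = parse(v)
            self._impls[(name, parsed[0])] = (parsed, impl)

    def resolve(self, name, requested):
        """Find an implementation with the same major version that is at least as new."""
        want = parse(requested)
        entry = self._impls.get((name, want[0]))
        if entry is None or entry[0] < want:
            raise LookupError(f"no implementation for {name}@{requested}")
        return entry[1]


# One implementation of bar serves both major versions because it opts in explicitly.
registry = Registry()
bar_impl = object()  # stand-in for a real host implementation
registry.register("acme:widget/bar", "2.0.0", bar_impl, also_supports=["1.0.0"])
```

The key design point is that cross-major sharing is opt-in per implementation, so the host never infers it from version strings or type shapes alone.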

package acme:widget@2.0.0

interface foo compatible-with(1.0.0) {
  bar: func()
}

This could then be encoded in a component type (assuming #248) as a re-export:

(component
  (type $foo (component
    (import "acme:widget/foo@1.0.0" (instance $old-foo
      (export "bar" (func))
    ))
    (export "acme:widget/foo@2.0.0" (instance $old-foo))
  ))
  (export "foo" (type $foo))
)

which seems to capture the semantics of what's going on here. This is coincidentally a nice supporting argument for the tweak @alexcrichton just suggested (enabling the instance re-export).

I don't have much of an opinion on the priority for making this addition or what the best concrete syntax is, but I think the idea generally makes sense.

martinitus commented 6 months ago

One thing that comes to mind when reading the above discussion:

Structural identity of two interfaces (e.g. "the same set of exported functions with identical signatures") is not necessarily the same as compatibility of the two interfaces. Simple example: func greet(user: string) could have a change of semantics where the user must be an email address in one version but not the other.

My gut feeling would therefore also be alternative 2, simply to avoid the risk of overlooking a change of semantics while the interfaces are still structurally identical.
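The greet example can be sketched in a few lines. This is only an illustration of the point, written in Python rather than WIT; the two function names and the email check are assumptions made up for the example.

```python
import inspect


def greet_v1(user: str) -> str:
    """Version 1: `user` is any display name."""
    return f"Hello, {user}!"


def greet_v2(user: str) -> str:
    """Version 2: `user` must be an email address (hypothetical semantic change)."""
    if "@" not in user:
        raise ValueError("user must be an email address")
    return f"Hello, {user}!"


# The signatures are structurally identical, so a purely structural check passes...
same_shape = inspect.signature(greet_v1) == inspect.signature(greet_v2)
```

Here `same_shape` is true, yet a caller that passes a plain name such as "alice" works against version 1 and fails against version 2, which is exactly the kind of change a structural comparison cannot detect.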

lukewagner commented 6 months ago

Yes, definitely we should avoid ever taking structural type compatibility to imply "the same or compatible interface" for the reason you mention. This is the reason why we generally consider WIT interfaces to be "nominal", in the sense that the name of the interface is an important part of the definition; it's not just a typedef for a structural type. But of course the devil is in the details of how this fact manifests itself throughout the toolchain and default behaviors.