WebAssembly / WASI

WebAssembly System Interface

Make release process more modular #450

Closed linclark closed 2 years ago

linclark commented 2 years ago

As discussed in the June 17 meeting (with an update on October 21), we have been working to make it easier to propose WASI APIs. This includes the previously discussed work on a more ergonomic syntax for defining APIs, a proposal template, and a simplified repo structure, which are all close to completion.

In addition to those changes, we need to look at the release process. The plan for WASI has evolved since its inception, but the way we thought of the release process didn’t change to reflect that.

Problems with the current release process

WASI is a much more modular set of APIs than are traditionally part of a system interface. But our process is to group all of these APIs into a single “snapshot” for release.

This causes the following problems.

  1. A snapshot has a single integer release number. With all proposal updates being grouped under this single version, there’s no sensible way to use semantic versioning to communicate about API changes.
  2. Most hosts will only support a subset of APIs, yet we bundle all the APIs into a single file. This means that consumers need to add a processing step to remove unnecessary APIs.
  3. The snapshots are in the main WASI repo, even though the work happens across different repos. Creating a single artifact in the WASI repo creates confusion about where issues should be filed.

Proposed new release process

Let’s look at another possible way of doing things. Instead of releasing a monolithic snapshot and versioning that snapshot, each API would cut its own releases.

These releases would still go through the WASI standardization process, which means that we would vote before major and minor releases. These releases would then be listed in an additional column in Proposals.md.

This release process would maintain the rigor of the standardization process, but would do so in a way that reflects the modularity of the APIs and addresses the issues above by:

  1. Making it possible for APIs to follow semantic versioning
  2. Allowing hosts to easily specify exactly which APIs they want to support before feeding them through tooling
  3. Providing a single source of truth (the proposal repo) for all APIs

I will be bringing this proposal up for discussion in the October 21 meeting, and if no major concerns come up during the conversation, plan to propose a vote for the subsequent meeting.

sbc100 commented 2 years ago

Regarding semantic versioning in particular, there was quite a lot of discussion around this early on, and I think we decided it didn't make sense to use this. One example point that I remember from those discussions is that the notion of bugfix might make sense for an implementation but doesn't make sense for an API itself. There were other reasons too, but I would have to go back and dig up the documents/discussions from that time.

linclark commented 2 years ago

One example point that I remember from those discussions is that the notion of bugfix might make sense for an implementation but doesn't make sense for an API itself.

We discussed this in today's meeting, but I want to make sure it's reflected here for those who couldn't attend.

This is one of the things that has evolved since the start of WASI. With the tool that Alex has been working on, we can generate language bindings for APIs. The current thinking is that this tool would use the comments from the interface definition as comments in the generated bindings as well, to help the end user of the API.

If there's a typo or something confusing in a comment and it gets fixed, we'd want that change to be propagated out. This kind of change would be a perfect case for a patch release.
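A minimal sketch of how a change to an interface definition might map onto a major, minor, or patch release, assuming the definition can be reduced to function signatures plus doc comments. The `Interface` shape and the `classify_release` helper are hypothetical and are not part of any WASI tooling.

```python
from typing import Dict, Tuple

# Hypothetical model of an interface definition:
# function name -> (signature, doc comment)
Interface = Dict[str, Tuple[str, str]]


def classify_release(old: Interface, new: Interface) -> str:
    """Return which kind of release a change would call for."""
    # A removed function or a changed signature breaks existing callers.
    for name, (signature, _doc) in old.items():
        if name not in new or new[name][0] != signature:
            return "major"
    # New functions are a backwards-compatible addition.
    if set(new) - set(old):
        return "minor"
    # Only the comments changed, e.g. a typo fix that should reach bindings.
    if any(new[name][1] != doc for name, (_sig, doc) in old.items()):
        return "patch"
    return "none"


old = {"statvfs": ("(path: string) -> stats", "Retuns filesystem stats.")}
fixed = {"statvfs": ("(path: string) -> stats", "Returns filesystem stats.")}
print(classify_release(old, fixed))
```

Under these assumptions, a comment-only fix is the one case that yields a patch release, which matches the typo example above.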

sbc100 commented 2 years ago

One example point that I remember from those discussions is that the notion of bugfix might make sense for an implementation but doesn't make sense for an API itself.

We discussed this in today's meeting, but I want to make sure it's reflected here for those who couldn't attend.

This is one of the things that has evolved since the start of WASI. With the tool that Alex has been working on, we can generate language bindings for APIs. The current thinking is that this tool would use the comments from the interface definition as comments in the generated bindings as well, to help the end user of the API.

If there's a typo or something confusing in a comment and it gets fixed, we'd want that change to be propagated out. This kind of change would be a perfect case for a patch release.

Sure, I can see that could make sense for the textual description of an interface. But I don't think it would make sense in the binary dependencies at the Wasm module level, right?

The fact that bug fixes don't make a lot of sense when talking about interfaces was just part of the decision to avoid semantic versioning for WASI interfaces. If you are re-proposing semantic versioning for WASI interfaces here I think we should go back and revisit the previous documents/discussions.

linclark commented 2 years ago

Sure, I can see that could make sense for the textual description of an interface. But I don't think it would make sense in the binary dependencies at the Wasm module level, right?

Let me make sure I understand what the issue here is. The snapshots (which are the artifacts that result from the process that I'm suggesting we replace) use the textual representation of the interface.

Are you suggesting that we should version binaries separately from the textual representation? Or that we should only version binaries? Or is the suggestion something different?

The fact that bug fixes don't make a lot of sense when talking about interfaces was just part of the decision to avoid semantic versioning for WASI interfaces. If you are re-proposing semantic versioning for WASI interfaces here I think we should go back and revisit the previous documents/discussions.

I could only find two previous discussions that went into any depth:

  1. The meeting notes from the 05/16/2019 meeting
  2. Your document from June 2019, where you outline a case against semver

Please let me know if I'm missing any previous discussions.

Summary of 05/16/2019 meeting notes

From the notes, the outcome from the meeting seemed to simply be "we need a user story or a set of scenarios", rather than anyone really objecting to using semver.

Summary of your document

I'm going to try to summarize the points from your document here, but please do let me know if my interpretation is incorrect.

Your document did lay out a position against semver, but if I understand the argument correctly, your document also argues against the idea that we would version WASI, not just against semver:

Here we argue that such complexity might not be needed in order to meet the versioning requirements of WASI modules. What is more we argue that it may be possible to avoid the need for version numbers completely.

The document then outlines why we don't need major, minor or patch versions.

We already talked about patch versions above, so I'll stick to major and minor.

Minor versions

I believe this is the main point from your discussion of minor versions:

Most such backwards compatible changes will happen by adding new functions to an existing interface. In this case, an application can express a dependency on a newer version by simply importing the new function. No need for a version here. A soft dependency can be expressed by weakly importing the newer function.

I'm assuming that by "weakly importing" you were referring to optional imports. Correct me if I'm wrong.

In that case, if the end user developer wants to use a new method that has been added to an API, wouldn't we be requiring that developer to wrap every use of a new method with what's basically a feature detection check?

For example, this is the C code snippet we currently have in the Optional Imports doc:

if (__wasm_is_present(wasm_fs.statvfs)) {
  wasm_fs.statvfs(...)
}

I think optional imports are a great feature to have. However, if we require every single new method to be an optional import, that feels heavyweight and confusing to me. But please let me know if that's not what you were suggesting as the solution.

Major versions

I believe this is the main point from your discussion of major versions:

Users of the initial version of “filesystem” could simply import “filesystem”, and only later when we are forced to break compatibility would we need to introduce “filesystem/v2”.

I understand this point, so much so that it's also reflected in this proposal.

The proposed alternative is as follows:

Some APIs may require backwards-incompatible changes over time. In these cases, we allow proposals to increment the major version number only if the old API can be implemented in terms of the new API. As part of the new version, champions are expected to provide a tool that enables this backwards-compatibility. If that is not possible, then a new API proposal with a new name should be started. The original API can then be deprecated over time if it makes sense to do so.

Let's say a champion wants to correct a single poor design decision in the API. With this approach, champions don't need to start an entirely new project to do that. As long as backwards compat can be maintained easily (using the champion-provided tool), then the champion can continue to develop the API using the same name and repo. This feels like a solid win to me.

Do you believe there are additional concerns that weren't surfaced here?

sbc100 commented 2 years ago

Sure, I can see that could make sense for the textual description of an interface. But I don't think it would make sense in the binary dependencies at the Wasm module level, right?

Let me make sure I understand what the issue here is. The snapshots (which are the artifacts that result from the process that I'm suggesting we replace) use the textual representation of the interface.

Are you suggesting that we should version binaries separately from the textual representation? Or that we should only version binaries? Or is the suggestion something different?

I think I'm just checking my assumption that the patch version would not make sense in the name of an import. For example it would not make sense to import a function from filesystem/v1.0.1.

The fact that bug fixes don't make a lot of sense when talking about interfaces was just part of the decision to avoid semantic versioning for WASI interfaces. If you are re-proposing semantic versioning for WASI interfaces here I think we should go back and revisit the previous documents/discussions.

I could only find two previous discussions that went into any depth:

Thanks for going back and digging through all this, much appreciated.

In that case, if the end user developer wants to use a new method that has been added to an API, wouldn't we be requiring that developer to wrap every use of a new method with what's basically a feature detection check?

For example, this is the C code snippet we currently have in the Optional Imports doc:

if (__wasm_is_present(wasm_fs.statvfs)) {
  wasm_fs.statvfs(...)
}

This construct would only be needed for a program that wanted to be usable with both the old and new versions of an interface at the same time. Programs that simply want to depend on the new version would not need that check, or the optional import.

Major versions

I believe this is the main point from your discussion of major versions:

Users of the initial version of “filesystem” could simply import “filesystem”, and only later when we are forced to break compatibility would we need to introduce “filesystem/v2”.

I understand this point, so much so that it's also reflected in this proposal.

The proposed alternative is as follows:

Some APIs may require backwards-incompatible changes over time. In these cases, we allow proposals to increment the major version number only if the old API can be implemented in terms of the new API. As part of the new version, champions are expected to provide a tool that enables this backwards-compatibility. If that is not possible, then a new API proposal with a new name should be started. The original API can then be deprecated over time if it makes sense to do so.

Let's say a champion wants to correct a single poor design decision in the API. With this approach, champions don't need to start an entirely new project to do that. As long as backwards compat can be maintained easily (using the champion-provided tool), then the champion can continue to develop the API using the same name and repo. This feels like a solid win to me.

Do you believe there are additional concerns that weren't surfaced here?

I'm not quite sure what you mean by champion provided tool. Would that be tools that transform source code, or binaries perhaps? I guess they transform the programs from ones that import version X of an API to ones that import version Y? In that case it seems like X and Y would be the major version only? i.e. the tool would only be needed in the case of backwards-incompatible changes which are by definition major version changes?

I think this means only the major part of the version would be in the name of the imported module?

Assuming I understand correctly then I think we are on the same page.

linclark commented 2 years ago

I think I'm just checking my assumption that the patch version would not make sense in the name of an import. For example it would not make sense to import a function from filesystem/v1.0.1.

Ah, yes, you are correct.

I expect that we’d want to use semver ranges in imports for expressing what can satisfy a dependency. The three semver ranges I can see being useful are (syntax TBD):

  1. Specifying the major version range, e.g. filesystem/2
  2. Specifying the minimum major version, e.g. filesystem/>2
  3. Specifying the minimum minor version within a major range, e.g. filesystem/^1.3
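As a rough illustration of the three range forms, here is a minimal sketch of a matcher. Since the syntax is still TBD, the parsing rules are assumptions: a bare number pins the major version, `>N` is read as "major version at least N" (following the wording of item 2), and `^M.m` means "same major, minor at least m". The `satisfies` and `parse_version` helpers are hypothetical.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int = 0


def parse_version(text: str) -> Version:
    # Accepts "2", "1.3" or "1.3.0"; a patch component is ignored for matching.
    parts = [int(p) for p in text.split(".")]
    return Version(parts[0], parts[1] if len(parts) > 1 else 0)


def satisfies(requirement: str, provided: str) -> bool:
    """Check a requirement such as 'filesystem/^1.3' against 'filesystem/1.6'."""
    req_name, req_range = requirement.split("/", 1)
    prov_name, prov_version = provided.split("/", 1)
    if req_name != prov_name:
        return False
    have = parse_version(prov_version)
    if req_range.startswith(">"):
        # Assumed reading of item 2: minimum major version.
        return have.major >= parse_version(req_range[1:]).major
    if req_range.startswith("^"):
        # Item 3: minimum minor version within one major range.
        want = parse_version(req_range[1:])
        return have.major == want.major and have.minor >= want.minor
    # Item 1: exactly this major version, any minor.
    return have.major == parse_version(req_range).major


print(satisfies("filesystem/^1.3", "filesystem/1.6"))
```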

This construct would only be needed for a program that wanted to be usable with both the old and new versions of an interface at the same time. Programs that simply want to depend on the new version would not need that check, or the optional import.

Let me use an example to make sure I understand.

Let’s say that wasi-clock gets a new function over time:

A component author uses monotonic_clock_now in their code. Then they try to run the code in an engine that has not yet implemented monotonic_clock_now.

This would result in a link-time error, telling the author that the monotonic_clock_now isn’t available in the wasi-clock API.

Is this what you were envisioning?

If so, I’d like to suggest a way that semver could improve the developer experience.

If we use semver ranges, as described above, then we could determine earlier in the process that the given engine does not fulfill the requirements. Additionally, we could provide a clearer error message (that the engine doesn’t support the required version of the API), which would help the developer debug the issue more quickly.

I'm not quite sure what you mean by champion provided tool. Would that be tools that transform source code, or binaries perhaps?

The current idea is more about virtualization than transform. The champion would create a Wasm module that virtualizes API v1, but that imports API v2.

This way, when the engine maintainers are updating their engine to support API v2, they can simply add the virtualized module for API v1 support to their runtime environment. This means that they don’t have to maintain multiple native versions of the API. They can choose to maintain both versions natively if they like, but it’s not required—they can get support without any additional burden by using this champion-provided virtualization in Wasm.
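The virtualization idea can be sketched as a thin adapter. This is only an illustration in Python rather than an actual Wasm module, and both API shapes (`ClockV2.now_ns`, `ClockV1OnV2.now_ms`) are hypothetical stand-ins for an old and a new version of some interface.

```python
class ClockV2:
    """Hypothetical v2 API, implemented natively by the engine."""

    def __init__(self, reading_ns: int) -> None:
        # A fixed reading keeps the example deterministic.
        self._reading_ns = reading_ns

    def now_ns(self) -> int:
        return self._reading_ns


class ClockV1OnV2:
    """Hypothetical v1 API, virtualized: it exposes the v1 surface
    but its only dependency is the v2 API."""

    def __init__(self, v2: ClockV2) -> None:
        self._v2 = v2

    def now_ms(self) -> int:
        # v1 reported milliseconds; derive them from the v2 nanosecond clock.
        return self._v2.now_ns() // 1_000_000


engine_clock = ClockV2(reading_ns=1_500_000_000)
legacy_clock = ClockV1OnV2(engine_clock)
print(legacy_clock.now_ms())
```

The engine only maintains `ClockV2`; programs built against v1 are served by the adapter.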

sbc100 commented 2 years ago

This would result in a link-time error, telling the author that the monotonic_clock_now isn’t available in the wasi-clock API.

Is this what you were envisioning?

Yes, exactly.

If so, I’d like to suggest a way that semver could improve the developer experience.

If we use semver ranges, as described above, then we could determine earlier in the process that the given engine does not fulfill the requirements. Additionally, we could provide a clearer error message (that the engine doesn’t support the required version of the API), which would help the developer debug the issue more quickly.

I'm not sure how that would allow us to determine anything earlier. Isn't it the same in both cases? i.e. the moment when a module is analyzed by the runtime to see if it can fulfill the dependencies?

I'm not necessarily disagreeing that it could improve error reporting, but what do you see as the benefits of seeing a version number in the error message? i.e. would it really be more useful to see:

error: wasi-clock-v2/monotonic_clock_now is not implemented

than:

error: wasi-clock/monotonic_clock_now is not implemented

(In the latter case the new symbol was injected into the existing (v1) API without changing the namespace, i.e. a backwards-compatible addition.)

Would you envisage a developer who wants to use monotonic_clock_now switching all the other imports to the v2 module too? What if one wants to continue to run on engines that have not yet implemented monotonic_clock_now? Would this involve weakly importing the new v2/monotonic_clock_now while strongly importing all the v1 symbols from the v1 namespace? Is that what you are proposing?

linclark commented 2 years ago

I'm not sure how that would allow us to determine anything earlier. Isn't it the same in both cases? i.e. the moment when a module is analyzed by the runtime to see if it can fulfill the dependencies?

An idea that's been popping up in multiple conversations is to have "profiles" that express what APIs a host supports. For example, a CLI might have a profile that includes wasi-filesystem but not wasi-http. In contrast, an IoT device might do the reverse—support wasi-http but not support wasi-filesystem.

A host could publish this profile somewhere, and then you could run a comparison between what your app requires and what the host provides. This wouldn't require you manually testing it out. And in a number of scenarios, not having to manually test it out has a significant benefit: you don't have to purchase a device or set up an account to determine whether your app will work on the target platform. You just run a simple, static check between an automatically generated app manifest and the platform profile.

So given this, we could provide the user an error message as follows (before the user invests in the platform):

Your application requires wasi-clock v2. This platform only supports wasi-clock v1.4 and below.
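A minimal sketch of such a static check, assuming both the app manifest and the platform profile can be reduced to a mapping from API name to a version string. The data shapes and the `check_profile` helper are hypothetical; it also assumes that a platform supporting a given version supports every lower version too.

```python
from typing import Dict, List, Tuple


def parse(version: str) -> Tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


def check_profile(manifest: Dict[str, str], profile: Dict[str, str]) -> List[str]:
    """Compare what an app requires with what a platform declares.

    manifest: API name -> minimum version the app requires
    profile:  API name -> highest version the platform supports
    """
    problems = []
    for api, required in sorted(manifest.items()):
        supported = profile.get(api)
        if supported is None:
            problems.append(
                f"Your application requires {api} v{required}. "
                f"This platform does not support {api}."
            )
        elif parse(required) > parse(supported):
            problems.append(
                f"Your application requires {api} v{required}. "
                f"This platform only supports {api} v{supported} and below."
            )
    return problems


for message in check_profile({"wasi-clock": "2"}, {"wasi-clock": "1.4"}):
    print(message)
```

No engine or device is involved: the check only reads the two declarations.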

What if one wants to continue to run on engines that have not yet implemented monotonic_clock_now? Would this involve weakly importing the new v2/monotonic_clock_now while strongly importing all the v1 symbols from the v1 namespace?

Yes, in the case where you want to optionally use v2 functions, but also run the same code in environments that don't yet support v2, you would express a hard dependency on v1, and then optionally import v2 functions.
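A minimal sketch of that combination, assuming a linker that treats required imports as hard dependencies and reports availability for optional ones. The `link` function, the `LinkError` type, and the import name `wasi-clock/now` are hypothetical.

```python
from typing import Dict, Iterable, Set


class LinkError(Exception):
    pass


def link(required: Iterable[str], optional: Iterable[str],
         host_exports: Set[str]) -> Dict[str, bool]:
    """Fail on any missing required import; report which optional
    imports the host actually provides."""
    missing = sorted(set(required) - host_exports)
    if missing:
        raise LinkError(
            "; ".join(f"error: {name} is not implemented" for name in missing)
        )
    return {name: name in host_exports for name in optional}


# An engine that only implements the (hypothetical) v1 function.
host = {"wasi-clock/now"}
present = link(
    required=["wasi-clock/now"],
    optional=["wasi-clock-v2/monotonic_clock_now"],
    host_exports=host,
)
if present["wasi-clock-v2/monotonic_clock_now"]:
    print("using monotonic_clock_now")
else:
    print("falling back to the v1 clock")
```

The returned mapping plays the role of the feature detection check: the program still loads on a v1-only engine and decides at run time whether to call the v2 function.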

sbc100 commented 2 years ago

Your application requires wasi-clock v2. This platform only supports wasi-clock v1.4 and below.

Would "Your application requires 'wasi-clock:monotonic_clock_now', but this platform does not support this API" not be equally useful? Perhaps it could be seen as more actionable because it states explicitly the API which is lacking and gives the developer a chance to make that particular import optional?

I'm mostly asking about this because I was thinking it would be advantageous to only modify the import names when backwards-incompatible changes are made. I was thinking that for backwards-compatible changes it might be nice to be able to use a single evolving API implementation.

sbc100 commented 2 years ago

  1. Specifying the major version range, e.g. filesystem/2
  2. Specifying the minimum major version, e.g. filesystem/>2
  3. Specifying the minimum minor version within a major range, e.g. filesystem/^1.3

How would (2) and (3) actually be used though?

Regarding (2): If major versions are not backwards compatible then how can an application say that it will be compatible with versions in the future?

Regarding (3): Same thing, but also why would this ever be better than simply importing the exact minimum version one needs? These are not implementations being imported here, but APIs, so I'm not sure it makes sense to specify this kind of flexible range like it does for something like an npm module.

linclark commented 2 years ago

Would "Your application requires 'wasi-clock:monotonic_clock_now', but this platform does not support this API" not be equally useful?

As an application developer, if I saw the message that you suggest, I'd need to figure out: Does this platform support a version of wasi-clock at all? Or have they just not implemented the version of wasi-clock that has monotonic_clock_now?

That feels like unnecessary cognitive load for the application developer, and as far as I can tell we don't gain anything in exchange for that cognitive load (as I explain at the end).

Perhaps it could be seen as more actionable because it states explicitly the API which is lacking and gives the developer a chance to make that particular import optional?

With the approach I'm suggesting, we could still surface which specific functions the application uses that are not supported by the engine. I just happened to not include it in my proposed error. And we could surface that information ahead-of-time, without the user having to manually test the engine. So I don't think this is an argument against using semver.

Regarding (2): If major versions are not backwards compatible then how can an application say that it will be compatible with versions in the future?

Fair point, this range probably isn't that useful.

Regarding (3): Same thing, but also why would this ever be better than simply importing the exact minimum version one needs? These are not implementations being imported here, but APIs, so I'm not sure it makes sense to specify this kind of flexible range like it does for something like an npm module.

Using a range like this is better than specifying an exact dependency because it solves the exact problem you pointed out: it means you only modify the import names when backwards-incompatible changes are made.

If I say I depend on filesystem/^1.3 and the engine supplies filesystem/1.6, I can still use it because it is a semver match. I didn't need to update any import names.

sbc100 commented 2 years ago

If I say I depend on filesystem/^1.3 and the engine supplies filesystem/1.6, I can still use it because it is a semver match. I didn't need to update any import names.

If minor versions are required to be backwards compatible then wouldn't filesystem/1.6 always be suitable for a module that asks for filesystem/1.3, i.e. couldn't the caret always be inferred?

sbc100 commented 2 years ago

With the approach I'm suggesting, we could still surface which specific functions the application uses that are not supported by the engine. I just happened to not include it in my proposed error. And we could surface that information ahead-of-time, without the user having to manually test the engine. So I don't think this is an argument against using semver.

I think we could do the ahead-of-time checks in either case (as you say you can just export a profile of what a given platform supports, which should work either way and sounds like a useful feature).

linclark commented 2 years ago

If minor versions are required to be backwards compatible then wouldn't filesystem/1.6 always be suitable for a module that asks for filesystem/1.3, i.e. couldn't the caret always be inferred?

If we were going by npm style semver resolution, filesystem/1.3 would only match 1.3.x releases. However, we don't need to stick to npm style resolution. Since the details of our notation in this case don't seem to block a decision, I recommend we address that in a follow up issue.

I think we could do the ahead-of-time checks in either case (as you say you can just export a profile of what a given platform supports, which should work either way and sounds like a useful feature).

If an engine specifies what interface versions it supports, then it's easy to generate the profile—you have a machine readable definition of all of the functions it commits to providing.

But if you don't have that information, how do you generate the profile?

sbc100 commented 2 years ago

If an engine specifies what interface versions it supports, then it's easy to generate the profile—you have a machine readable definition of all of the functions it commits to providing.

But if you don't have that information, how do you generate the profile?

Whatever system we design, it should be possible for a given runtime to declare exactly what it supports in a machine-readable way. I don't see how the points we are discussing here (regarding how and where to include version information in the import names) affect the ability to do this.

sbc100 commented 2 years ago

If minor versions are required to be backwards compatible then wouldn't filesystem/1.6 always be suitable for a module that asks for filesystem/1.3, i.e. couldn't the caret always be inferred?

If we were going by npm style semver resolution, filesystem/1.3 would only match 1.3.x releases. However, we don't need to stick to npm style resolution. Since the details of our notation in this case don't seem to block a decision, I recommend we address that in a follow up issue.

But if 1.5 and 1.6 are (by strict definition) compatible with filesystem/1.3, then why make the distinction between 1.3 and ^1.3? Any server that provides 1.5 or 1.6 will implicitly support 1.3, right? In other words, why include support for the caret range thing at all? It seems to complicate matters needlessly (and encourages us to confuse our API versioning with npm module versioning).

sbc100 commented 2 years ago

We can look at the caret thing from the other perspective too: if I were to leave out the caret on my 1.3, it's kind of like saying I am not compatible with 1.4. But isn't that logically impossible if 1.4 is by definition compatible with 1.3?

Again, unlike npm modules, we are not importing implementations, so it's not possible, for example, for there to be a bug in 1.4 and above that I want to avoid by pinning to 1.3.

sbc100 commented 2 years ago

Perhaps we are getting too much into the weeds here. I don't disagree with most of what is in this proposal, but I think we should consider the use of semver carefully before adopting it.

lukewagner commented 2 years ago

Could a valid high-level summary be:

?

sbc100 commented 2 years ago

Thanks Luke, that sounds like a good summary!

I would add that I think it would be great if not only ^ is inferred but also that we should be able to completely avoid semver ranges in import names.. imports would only ever need to be precise (major, minor) tuples (the engine would know that major+N was never suitable/compatible and minor+N was always suitable/compatible).
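A minimal sketch of that rule, assuming imports carry an exact (major, minor) tuple and the engine applies the compatibility logic itself. The `engine_satisfies` helper is hypothetical.

```python
from typing import Tuple

Version = Tuple[int, int]  # (major, minor)


def engine_satisfies(requested: Version, provided: Version) -> bool:
    """A different major is never compatible; within the same major,
    an equal or higher minor is always compatible."""
    req_major, req_minor = requested
    prov_major, prov_minor = provided
    return prov_major == req_major and prov_minor >= req_minor


# A module that imports filesystem (1, 3), checked against three engines.
for provided in [(1, 6), (1, 2), (2, 0)]:
    print(provided, engine_satisfies((1, 3), provided))
```

With this rule no range syntax appears in the import name: the module states the minimum it was built against and the engine does the rest.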

linclark commented 2 years ago

Great, thanks for that summary, Luke.

Given that, I'll add the vote to next week's meeting. I'll make sure that the vote is clearly scoped to only this proposal (which, to recap, is to adopt a release process that versions interfaces separately and uses major/minor/patch tuples to express the versions). We can then discuss the details of resolution in a subsequent issue.

syrusakbary commented 2 years ago

Regarding semantic versioning in particular, there was quite a lot of discussion around this early on, and I think we decided it didn't make sense to use this.

Here's the previous discussion: #360

linclark commented 2 years ago

That is one previous discussion, but it doesn't really talk about semver extensively. Frank asks whether semver was ruled out, Pat says semver only makes sense once we're getting close to stable, and then Sam links to a previous discussion.

Since this issue was decided by a consensus poll (in favor) in the meeting on November 4, I will close this out now.