WebAssembly / component-model

Repository for design and specification of the Component Model

Why are parameter names in component functions necessary? #311

Closed oovm closed 4 months ago

oovm commented 4 months ago

There is something I don’t understand when I look at the function definition.

Why do the parameter names of component functions have to be exactly the same?

They cannot be shortened or omitted.

functype      ::= (func <paramlist> <resultlist>)
paramlist     ::= (param "<label>" <valtype>)*
resultlist    ::= (result "<label>" <valtype>)*
                | (result <valtype>)

This is very different from the core function. The parameter names of the wasm function are not very important.


If a language does not have optional parameters, named parameters, overloading, etc., parameter names should be irrelevant.

I think that for a function in an abstract interface, it should be enough that the positional type of each parameter is correct.

Some languages, such as Rust, work this way.

struct WasiInstance {}

pub trait WasiInterface {
    fn comp_function(&self, a: i32) -> String;
}
impl WasiInterface for WasiInstance {
    fn comp_function(&self, b: i32) -> String {
        format!("comp_function: {}", b)
    }
}

tschneidereit commented 4 months ago

The reason for this is that component interfaces are meant to work as well as possible for all languages, and importantly for combinations between all languages.

As you say, for some languages parameter names are very important, so if names were treated as optional in components, bindings for those languages would necessarily be quite a bit worse.

Another example of the same reasoning is the syntax for resources, with constructors and methods that get an implicit self parameter. Languages like C can't make much use of this: bindings for them could be about as good with only functions in the interface. But for other languages like Rust, C++, JS, etc., the bindings would be much worse.

oovm commented 4 months ago

I still think this restriction should be relaxed at least at the abi level

In short, for the sake of reducing the size, I still hope not to check whether the function parameter names are consistent.

oovm commented 4 months ago

In addition, as a language developer, I hope to use the wasi interface as a language standard library in my language.

However, there are many contributors to the wasi interface, and their naming styles and tastes are not consistent. This will lead to very inconsistent naming style in my language standard library.


For example, here is field-size, that's ok

https://github.com/WebAssembly/WASI/blob/889e8bd89c2c73912a223f0a60ad711c46e38d3e/preview2/http/types.wit#L93

here is filesize, not file-size, the style is inconsistent.

https://github.com/WebAssembly/WASI/blob/889e8bd89c2c73912a223f0a60ad711c46e38d3e/preview2/filesystem/types.wit#L361

For example, here is length, that's ok.

https://github.com/WebAssembly/WASI/blob/889e8bd89c2c73912a223f0a60ad711c46e38d3e/preview2/filesystem/types.wit#L386-L391

Here it becomes len.

https://github.com/WebAssembly/WASI/blob/889e8bd89c2c73912a223f0a60ad711c46e38d3e/preview2/random/random.wit#L19


Some languages just like to use len, and some languages just like to use length. I'm not arguing which one is better, I just hope that at least the style can be unified.

From the perspective of naming style, I also hope that there will be no need to check names, so that each language can change the naming to suit its own style.

tschneidereit commented 4 months ago

* The naming style is not uniform: there are C-style ultra-short abbreviations, and there are also full names that are longer than Objective-C names.

You are right, there are limits to how much we can do to ensure that bindings are idiomatic. That's not an argument for giving up on the idea as a whole.

* It is difficult to ensure that all parameter names do not touch keywords in all languages that wish to support wasi.

They don't have to: bindings generators can mangle names that would conflict with the language, and they already do.
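
As a rough illustration of the kind of mangling meant here, the sketch below maps a kebab-case WIT label to a Rust identifier and escapes reserved words with a raw identifier. The helper name `to_rust_ident` and the keyword list are hypothetical and deliberately incomplete; this is not the actual wit-bindgen implementation.

```rust
/// Hypothetical helper, for illustration only: turn a kebab-case WIT label
/// into a Rust identifier, escaping names that collide with Rust keywords.
fn to_rust_ident(label: &str) -> String {
    // WIT labels are kebab-case; Rust parameter names are snake_case.
    let ident = label.replace('-', "_");

    // A deliberately incomplete keyword list (assumption, not exhaustive).
    // Keywords such as `self` or `crate` cannot be raw identifiers and would
    // need a different escape, so they are left out of this sketch.
    const KEYWORDS: &[&str] = &[
        "as", "fn", "impl", "loop", "match", "mod", "move", "ref", "type", "use", "where",
    ];

    if KEYWORDS.contains(&ident.as_str()) {
        // Raw identifiers let a keyword be used as an ordinary name.
        format!("r#{}", ident)
    } else {
        ident
    }
}
```

With this sketch, `to_rust_ident("field-size")` gives `field_size`, while a parameter labelled `type` would become `r#type`, so the name in the interface stays recognizable without conflicting with the language.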

* Not conducive to merging function signatures
  * For example, a small function with only one or two parameters could refer to the same definition

Yes, there is necessary overhead in this that is not needed for all languages. That's the cost of building an abstraction that is meant to work across languages.

* The parameter names in the function are not really used

  * At least jco and wit-bindgen generate a large number of trampoline functions with parameters such as arg0, arg1, etc., and never read the parameter name field of the function

The trampoline functions are internal details of the bindings and not meant to be part of the public interface. Jco and wit-bindgen very much use the names where it matters: in the API exposed to users of the bindings.
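
The split between internal trampolines and the public API might look roughly like the sketch below. All names, types, and the function body are hypothetical stand-ins rather than real jco or wit-bindgen output: the positional `argN` names stay private, and the public wrapper is where the parameter names from the interface appear.

```rust
// Internal detail: a positional entry point standing in for generated glue
// code. The name `trampoline_request` and the body are assumptions made for
// this sketch.
fn trampoline_request(arg0: &str, arg1: u32, arg2: u32) -> String {
    format!("{} (timeout={}, retry={})", arg0, arg1, arg2)
}

/// Public binding: the parameter names a user of the bindings sees.
/// They are assumed here to come from the component's function type.
pub fn request(message: &str, timeout: u32, retry: u32) -> String {
    // Forward to the internal trampoline purely by position.
    trampoline_request(message, timeout, retry)
}
```

A caller only ever writes something like `request("ping", 30, 3)`, and editor hints or generated docs show `message`, `timeout`, and `retry` rather than `arg0`, `arg1`, and `arg2`.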

In short, for the sake of reducing the size, I still hope not to check whether the function parameter names are consistent.

From the perspective of naming style, I also hope that there will be no need to check names, so that each language can change the naming to suit its own style.

Apart from everything else, I don't even know what this would look like. Are you imagining that users of a component would as the very first step read through the entire component and decide on names for all parameters? Or how else would this decision be made?

oovm commented 4 months ago

If all parameter names are erased, what should the user do as the first step?

Download the original wit file.


Compared with the TypeScript ecosystem, *.wasm is effectively equivalent to *.min.js, and wit is effectively equivalent to *.d.ts + *.js.source-map.

If users want to generate lossless js bindings, they must have a complete wit definition file.

Package management will help them find the correct wit definition (according to the wit world name, similar to DefinitelyTyped in the TypeScript ecosystem).

/*
 * comment from wit file
*/
function request(message: string, timeout: number, retry: number)

If the user downloads the extremely optimized wasm directly from the web page, then after decompilation they can only obtain:

function request(arg0: string, arg1: number, arg2: number)

oovm commented 4 months ago

Including parameter names in the abi check will also cause huge compatibility pressure.

The parameter names in the definition cannot be modified freely under the constraints of the semantic version number.

If we want to change it, it will require a very long process.

If we want to improve the overall naming consistency of the wasi interface next, then the following release process is required:

// wasi 1.0.0
get-random-bytes(len: u64)
// wasi 1.1.x
@deprecated
get-random-bytes(len: u64)
get-random-bytes-parameters-consistent(length: u64)
// wasi 2.0.0
get-random-bytes-parameters-consistent(length: u64)

In fact IDL should be divided into two parts

This allows more features to be added while maintaining abi stability.

For example, to support the default parameter feature, ABI does not need to be changed, because as an intersection of features, some languages do not have this feature.

wit can directly add this feature. When the user generates a language binding, if the language does have this feature, a function with default parameters will be generated.

Under this division, parameter names are an optional feature and therefore should not be included in the ABI.

tschneidereit commented 4 months ago

Download the original wit file.

Even if we wanted to change things to require this, I don't see how it'd help: the names in the component are the same as in the wit file, so if they don't work for you in the component, they won't work any better if taken from the wit file.

More fundamentally, it's a key goal of the component model to support using components without the original wit file.

Components aren't meant to be the low-level ABI of an API expressed in WIT which is then erased as much as possible. In fact, many development tools will go further than embedding just the parameter names by also embedding the doc comments for functions, so that bindings can include API docs just based on a component.

Same as with the focus on cross-language interoperability, I think it's out of the question that we will change this, so it might be most productive for you to think about components in this way, and not try to get these aspects changed because you have different priorities.

oovm commented 4 months ago

OK, I didn't notice that this was a lossless format, and I wouldn't object to that further.

I hope to improve the naming consistency in the next minor version, since there will be almost no chance to modify it after the official release.

In OS and many popular libraries, some spelling errors have even become conventions.

I hope to develop some spell checking tools based on wit-parser, at least under the wasi namespace, to avoid such problems.

tschneidereit commented 4 months ago

OK, I didn't notice that this was a lossless format, and I wouldn't object to that further.

Ah right—that is indeed a crucial aspect, and one that we should perhaps try to emphasize more.

I hope to improve the naming consistency in the next minor version, since there will be almost no chance to modify it after the official release.

In OS and many popular libraries, some spelling errors have even become conventions.

I agree that this is a downside to having the names be part of the interface. It of course would always exist for function and interface names, but Components have an even broader surface than other systems.

I just filed #312 with a proposed way to address this.

I hope to develop some spell checking tools based on wit-parser, at least under the wasi namespace, to avoid such problems.

That sounds fantastic! ❤️