WebAssembly / component-model

Repository for design and specification of the Component Model

Provide values for enums/constants #365

Open vlhomutov opened 1 month ago

vlhomutov commented 1 month ago

When using WIT to describe some existing interface, it is sometimes required to have enums with specific values. But there is no way to express this in WIT currently.

Think of bitflags, for example.

We have something simple like this (C):

#define FOO 0x2
#define BAR 0x4

int some_func(int arg) {
    if (arg == 1)
        return FOO;
    else
        return BAR;
}

Or your function returns negative constants (ERR_X = -1, ERR_Y = -33, ERR_Z = -22).

Or maybe your interface has some magic constants like MY_BETTER_PI = 3.44 that need to be passed around or may be returned from functions.

I understand that WIT is about types, but plain types are just not enough to describe an interface. Currently, the WIT definition is the only place where a host application can provide this data to an unknown number of clients written in languages unknown to the host. Otherwise, we are back to the days when we had to parse C includes to be able to use FFI.

lukewagner commented 3 weeks ago

Hi, this is a good question so thanks for filing an issue and sorry for the slow reply (last week was the in-person CG meeting)!

So coincidentally, with the recently-merged #336, I think we have a Component Model primitive that matches what you're describing. In particular, when defining an interface, in WAT I can write:

(type (instance
  (value $foo u32 2)
  (value $bar u32 4)
  (export "foo" (value (eq $foo)))
  (export "bar" (value (eq $bar)))
  (export "some_func" (func (param "arg" u32) (result u32)))
))

#336 didn't add WIT-level syntax to capture this new functionality, but we should before too long, and so you could imagine that in WIT you can write something like:

interface whatever {
  foo = 2;
  bar = 4;
  some_func: func(arg: u32) -> u32;
}
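For illustration only, here is a minimal C sketch of what a bindings generator might emit for the hypothetical WIT above. The `WHATEVER_` prefix, the macro names, and the stand-in body of `whatever_some_func` are all assumptions made for this example; they are not the output of any existing generator.

```c
#include <stdint.h>

/* Hypothetical generated bindings for the `whatever` interface.
 * The constants are scoped by prefixing them with the interface name;
 * that naming scheme is an assumption of this sketch. */
#define WHATEVER_FOO UINT32_C(2)
#define WHATEVER_BAR UINT32_C(4)

/* Stand-in implementation that mirrors the C function from the
 * original report: returns FOO for arg == 1, BAR otherwise. */
uint32_t whatever_some_func(uint32_t arg) {
    return (arg == 1) ? WHATEVER_FOO : WHATEVER_BAR;
}
```

A caller could then compare the result against `WHATEVER_FOO` rather than a bare `2`, which is the scoping benefit the interface-level value definitions are meant to provide.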

An obvious improvement that you might want is a more direct typed connection between foo/bar and the return type of some_func, which might be nice for bindings generators to, e.g., scope the constants. Of course, you can get this with an enum:

interface whatever {
  enum e { foo, bar };
  some_func: func(arg: u32) -> e;
}

and a C code generator can generate #defines for foo and bar so you can get the raw integer values. But now you've lost the ability to control the specific integer values. However, if we were to allow enums to declare arbitrary integral values, that opens up a tricky question: does the CABI guarantee that only valid enum values are passed? If "no", then it's not really an enum, it's really just some integer type with any allowable integer value. If "yes", then the CABI has to do some possibly-expensive validation logic, and there are also legitimate reasons for an interface author to intentionally allow more values than just those explicitly declared. While I can imagine a new type that works like an integer with associated constant values, I wonder if it's worth it when it seems like the above value definitions (which are scoped, btw, by the interface) are Good Enough, at least for the time being.
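To make the validation trade-off concrete, here is a rough C sketch contrasting the two checks. It assumes a hypothetical "enum with explicit values" whose declared values are the error codes from the original report (-1, -33, -22); the function names are invented for this example and do not correspond to any Canonical ABI routine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical set of explicitly declared values (ERR_X, ERR_Y, ERR_Z). */
static const int32_t declared_values[] = { -1, -33, -22 };

/* The "yes" case: every value crossing the boundary is checked for
 * membership in the declared set. With sparse values this needs a
 * scan (or a lookup structure), not a single comparison. */
bool is_declared_value(int32_t v) {
    size_t n = sizeof declared_values / sizeof declared_values[0];
    for (size_t i = 0; i < n; i++) {
        if (declared_values[i] == v) {
            return true;
        }
    }
    return false;
}

/* An ordinary enum with contiguous case indices 0..case_count-1
 * can be validated with one bounds check. */
bool is_valid_case_index(uint32_t index, uint32_t case_count) {
    return index < case_count;
}
```

In the "no" case neither check runs, and the value is effectively a plain integer with some named constants attached.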

rossberg commented 3 weeks ago

I would also argue that integer mappings of enum values are not an interface concern but one of lowering to a specific language, and only for those languages that actually expose enums as ints in the first place.

lukewagner commented 3 weeks ago

Yeah, I think that's why we should consider this entirely separate from enums; if there is a new feature to consider adding here, it's some sort of specialization of an integer with a named subset of values.

vlhomutov commented 3 weeks ago

I agree that if you set values in enums, it is no longer an enum, and enums should not be extended this way. Still, we need some way to set named values in the interface description, along with the type. How this is translated to the target language is a task for the specific generator. Obvious candidates are named constants. People may want to define versions, dates, cryptographic values, and various other numbers that are required to use the declared interface. I'm not sure, though, how far this can go: we start with numeric constants and one day we may end up with arrays, records and objects with constructors. Maybe this should be a different language that describes data.

lukewagner commented 3 weeks ago

Did the middle code snippet in my first comment make sense, or were you imagining something else?

vlhomutov commented 3 weeks ago

> Did the middle code snippet in my first comment make sense, or were you imagining something else?

Yes, absolutely. Maybe even "u32 bar: 42", so we have control over type sizes.

lukewagner commented 3 weeks ago

Oops, yes, you're right; the type would be necessary as part of the value definition in the WIT too.
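As a sketch of why the explicit type matters, the C below shows how typed value definitions might map to typed constants in generated bindings. The names and the `WHATEVER_` prefix are assumptions; the values come from examples earlier in the thread ("u32 bar: 42", ERR_Y = -33, MY_BETTER_PI = 3.44), and the choice of s32 and a 64-bit float for the last two is a guess made for illustration.

```c
#include <stdint.h>

/* Hypothetical typed constants a generator could emit once each value
 * definition carries its type. */
static const uint32_t WHATEVER_BAR = 42;             /* u32 */
static const int32_t  WHATEVER_ERR_Y = -33;          /* s32, assumed */
static const double   WHATEVER_MY_BETTER_PI = 3.44;  /* 64-bit float, assumed */

/* With the type recorded, the width can be checked at compile time
 * instead of being left to each target language's default int. */
_Static_assert(sizeof(WHATEVER_BAR) == 4, "u32 constant must be 4 bytes");
_Static_assert(sizeof(WHATEVER_ERR_Y) == 4, "s32 constant must be 4 bytes");
```

Without the type, a generator would have to pick a width and signedness per language, which is exactly the ambiguity the "u32 bar: 42" suggestion avoids.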