⚠️ Migrated to https://github.com/bytecodealliance/wasm-tools/tree/main/crates/wasm-wave ⚠️ |
---|
WAVE is a human-oriented text encoding of WebAssembly Component Model values. It is designed to be consistent with the WIT IDL format.
Type | Example Values |
---|---|
Bools | true , false |
Integers | 123 , -9 |
Floats | 3.14 , 6.022e+23 , nan , -inf |
Chars | 'x' , '☃︎' , '\'' , '\u{0}' |
Strings | "abc\t123" |
Tuples | ("abc", 123) |
Lists | [1, 2, 3] |
Records | {field-a: 1, field-b: "two"} |
Variants | days(30) , forever |
Enums | south , west |
Options | "flat some" , some("explicit some") , none |
Results | "flat ok" , ok("explicit ok") , err("oops") |
Flags | {read, write} , {} |
use wasmtime::component::{Type, Val};
let val: Val = wasm_wave::from_str(&Type::String, "\"👋 Hello, world! 👋\"").unwrap();
println!("{}", wasm_wave::to_string(&val).unwrap());
→ "👋 Hello, world! 👋"
Values are encoded as Unicode text. UTF-8 should be used wherever an interoperable binary encoding is required.
Whitespace is insignificant between tokens and significant within tokens: keywords, labels, chars, and strings.
Comments start with //
and run to the end of the line.
Several tokens are reserved WAVE keywords: true
, false
, inf
, nan
, some
, none
, ok
, err
. Variant or enum cases that match one of these keywords must be prefixed with %
.
Kebab-case labels are used for record fields, variant cases, enum cases, and flags. Labels use ASCII alphanumeric characters and hyphens, following the Wit identifier syntax:
one
, two-words
q
, abc123
HTTP3
, method-GET
%
; this is not part of the label itself but allows for representing labels that would otherwise be parsed as keywords.
%err
, %non-keyword
Bools are encoded as one of the keywords false
or true
.
Integers are encoded as base-10 numbers.
TBD: hex/bin repr? e.g.
0xab
,0b1011
Floats are encoded as JSON numbers or one of the keywords nan
, (not a number) inf
(infinity), or -inf
(negative infinity).
Chars are encoded as '<char>'
, where <char>
is one of:
\'
→ '
\"
→ "
\\
→ \
\t
→ U+9 (HT)\n
→ U+A (LF)\r
→ U+D (CR)\u{···}
→ U+··· (where ···
is a hexadecimal Unicode Scalar Value)Escaping newline (\n
), \
, and '
is mandatory for chars.
Strings are encoded as a double-quote-delimited sequence of <char>
s (as for Chars).
Escaping newline (\n
), \
, and "
is mandatory for strings.
A multiline string begins with """
followed immediately by a line break (\n
or \r\n
) and ends with a line break, zero or more spaces, and """
. The number of spaces immediately preceding the ending """
determines the indent level of the entire multiline string. Every other line break in the string must be followed by at least this many spaces which are then omitted ("dedented") from the decoded string.
Each line break in the encoded string except for the first and last is decoded as a newline character (\n
).
Escaping \
is mandatory for multiline strings. Escaping carriage return (\r
) is mandatory immediately before a literal newline character (\n
) if it is to be retained. Escaping "
is mandatory where necessary to break up any sequence of """
within a string, even if the first "
is escaped (i.e. \"""
is prohibited).
"""
A single line
"""
→ "A single line"
"""
Indentation determined
by ending delimiter
"""
→
" Indentation determined\n by ending delimiter"
"""
Must escape carriage return at end of line: \r
Must break up double quote triplets: ""\""
"""
→
"Must escape carriage return at end of line: \r\nMust break up double quote triplets: \"\"\"\""
Tuples are encoded as a parenthesized sequence of comma-separated values. Trailing commas are permitted.
tuple<u8, string>
→ (123, "abc")
Lists are encoded as a square-braketed sequence of comma-separated values. Trailing commas are permitted.
list<char>
→ []
, ['a', 'b', 'c']
Records are encoded as curly-braced set of comma-separated record entries. Trailing commas are permitted. Each record entry consists of a field label, a colon, and a value. Fields may be present in any order. Record entries with the option
-typed value none
may be omitted; if all of a record's fields are omitted in this way the "empty" record must be encoded as {:}
(to disambiguate from an empty flags
value).
record example {
must-have: u8,
optional: option<u8>,
}
→ {must-have: 123}
= {must-have: 123, optional: none,}
record all-optional {
optional: option<u8>,
}
→ {:}
= {optional: none}
Note: Field labels may be prefixed with
%
but this is never required.
Variants are encoded as a case label. If the case has a payload, the label is followed by the parenthesized case payload value.
If a variant case matches a WAVE keyword it must be prefixed with %
.
variant response {
empty,
body(list<u8>),
err(string),
}
→ empty
, body([79, 75])
, %err("oops")
Enums are encoded as a case label.
If an enum case matches a WAVE keyword it must be prefixed with %
.
enum status { ok, not-found }
→ %ok
, not-found
Options may be encoded in their variant form (e.g. some(...)
or none
). A some
value may also be encoded as the "flat" payload value itself, but only if the payload is not an option or result type.
option<u8>
→ 123
= some(123)
Results may be encoded in their variant form (e.g. ok(...)
, err("oops")
). An ok
value may also be encoded as the "flat" payload value itself, but only if it has a payload which is not an option or result type.
result<u8>
→ 123
= ok(123)
result<_, string>
→ ok
, err("oops")
result
→ ok
, err
Flags are encoded as a curly-braced set of comma-separated flag labels in any order. Trailing commas are permitted.
flags perms { read, write, exec }
→ {write, read,}
Note: Flags may be prefixed with
%
but this is never required.TBD: Allow record form?
{read: true, write: true, exec: false}
TBD (
<named-type>(<idx>)
?)
Some applications may benefit from a standard way to encode function calls and/or results, described here.
Function calls can be encoded as some application-dependent function identifier (such as a kebab-case label) followed by parenthesized function arguments.
If function results need to be encoded along with the call, they can be separated from the call by ->
.
my-func("param")
with-result() -> ok("result")
Arguments are encoded as a sequence of comma-separated values.
Any number of option
none
values at the end of the sequence may be omitted.
// f: func(a: option<u8>, b: option<u8>, c: option<u8>)
// all equivalent:
f(some(1))
f(some(1), none)
f(some(1), none, none)
TBD: Named-parameter encoding? e.g.
my-func(named-param: 1)
Could allow omitting "middle"none
params.
Results are encoded in one of several ways depending on the number of result values:
()
or omitted entirely.-> ()
-> some("single result")
// or
-> (0: some("single result"))
-> (result-a: "abc", result-b: 123)
:ocean: