Closed poorna2152 closed 2 years ago
AssemblyScript does not provide support for `any` or union types.
However, we can do the following in AssemblyScript:
export function add32(x: i32, y: i32): i32 {
  return add(x, y);
}
export function add64(x: i64, y: i64): i64 {
  return add(x, y);
}
export function add<T>(a: T, b: T): T {
  return a + b;
}
The types that `T` takes in calls to `add` need to be known at compile time, so that the compiler can create a function for each of those types:
(func $add<i64> (param $0 i64) (param $1 i64) (result i64)
  local.get $0
  local.get $1
  i64.add
)
(func $add<i32> (param $0 i32) (param $1 i32) (result i32)
  local.get $0
  local.get $1
  i32.add
)
Assignability (https://www.assemblyscript.org/types.html#assignability): this too seems to be a compile-time feature.
JavaScript can be compiled to WASM using NectarJS. This produces a WASM file, but when that file was converted to WAT, it did not provide useful information on how the types were handled.
Current implementation of `any`.
Example:
import ballerina/io;
public function main() {
    foo(57); // @output 57
    foo(()); // @output
    foo(9223372036854775807); // @output 9223372036854775807
}
function foo(any x) {
    io:println(x);
}
(module
  (import "console" "log" (func $println (param i64)))
  (memory $0 1 256)
  (global $offset (mut i32) (i32.const 0))
  (export "memory" (memory $0))
  (export "main" (func $main))
  (export "foo" (func $foo))
  (func $main
    (local $0 i64)
    (local $1 i64)
    (local $2 i64)
    (block
      (call $foo
        (call $int_to_tagged
          (i64.const 57)))
      (call $foo
        (i64.const 2305843009213693952))
      (call $foo
        (call $int_to_tagged
          (i64.const 9223372036854775807)))
      (return)))
  (func $foo (param $0 i64)
    (local $1 i64)
    (block
      (call $println
        (local.get $0))
      (return)))
  (func $int_to_tagged (param $0 i64) (result i64)
    (local $1 i32)
    (if
      (i32.and
        (i64.gt_s
          (local.get $0)
          (i64.const -36028797018963968))
        (i64.lt_s
          (local.get $0)
          (i64.const 36028797018963967)))
      (return
        (i64.or
          (i64.or
            (i64.and
              (local.get $0)
              (i64.const 72057594037927935))
            (i64.const 2305843009213693952))
          (i64.const 504403158265495552)))
      (block
        (i64.store
          (global.get $offset)
          (local.get $0))
        (local.set $1
          (global.get $offset))
        (global.set $offset
          (i32.add
            (global.get $offset)
            (i32.const 8)))
        (return
          (i64.or
            (i64.const 504403158265495552)
            (i64.extend_i32_u
              (local.get $1))))))))
`println` definition in JavaScript:
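console: {
  log: function(arg) {
    if (Number(arg & IMMEDIATE_FLAG) != 0) {
      // Immediate: the value is packed into the tagged i64 itself.
      console.log(getImmediateValue(arg).toString());
    }
    else {
      // Boxed: the low 32 bits of the tagged value locate the i64 in memory,
      // which is read as two 32-bit words (low word first).
      let loc = Number(arg & ((2n**32n) - 1n));
      let x = (BigInt(memory[2*loc + 1]) << 32n) | (BigInt(memory[2*loc]));
      console.log(x.toString());
    }
  }
},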
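To make the tagged layout easier to follow, here is a rough JavaScript model of `int_to_tagged` and a matching decoder. The constants are read off the WAT above; the names `INT_TAG`, `PAYLOAD_MASK`, `intToTagged`, `taggedToInt`, and the `heap` buffer are hypothetical stand-ins (the real code uses WASM linear memory and `$offset`), so treat this as a sketch rather than the actual implementation.

```javascript
// Constants taken from the WAT above; the names are made up for this sketch.
const IMMEDIATE_FLAG = 1n << 61n;        // 2305843009213693952
const INT_TAG = 7n << 56n;               // 504403158265495552
const PAYLOAD_MASK = (1n << 56n) - 1n;   // 72057594037927935
const MIN_IMMEDIATE = -(1n << 55n);      // -36028797018963968
const MAX_IMMEDIATE = (1n << 55n) - 1n;  // 36028797018963967

// Stand-in for linear memory plus the $offset bump pointer (assumption).
const heap = new DataView(new ArrayBuffer(64));
let offset = 0;

function intToTagged(value) {
  // Same strict comparisons as the WAT (i64.gt_s / i64.lt_s).
  if (value > MIN_IMMEDIATE && value < MAX_IMMEDIATE) {
    return (value & PAYLOAD_MASK) | IMMEDIATE_FLAG | INT_TAG;
  }
  // Too wide for 56 bits: store the value and return a tagged pointer.
  heap.setBigInt64(offset, value, true);
  const ptr = BigInt(offset);
  offset += 8;
  return INT_TAG | ptr;
}

function taggedToInt(tagged) {
  if ((tagged & IMMEDIATE_FLAG) !== 0n) {
    // Sign-extend the 56-bit payload back to a full integer.
    return BigInt.asIntN(56, tagged & PAYLOAD_MASK);
  }
  // Low 32 bits hold the byte offset of the boxed value.
  const ptr = Number(tagged & 0xffffffffn);
  return heap.getBigInt64(ptr, true);
}
```

With this model, `taggedToInt(intToTagged(x))` gives back `x` both for small values such as `57n` and for values such as `9223372036854775807n` that have to be boxed.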
Possible method:
- Write `int_to_tagged` and `tagged_to_int` functions in WASM.
- Export the `tagged_to_int` function from WASM and call it in the `println` function.

From discussion: It is okay to allocate space in memory if WASM GC takes care of it. Otherwise we have to do GC ourselves or find a better way of representing `any` and unions.
> It is okay to allocate space in memory if WASM gc takes care of it.

It will not be GCed by the WASM engine.

> Else we have to do gc ourselves

We prefer not to do this.
The plan is to look at https://github.com/WebAssembly/gc and find a strategy that will work once GC is available in WASM engines. We'll also have to see whether there is a reference implementation of it; if not, we have to find or implement a polyfill scheme so that we can continue development.
Please take a look at https://github.com/WebAssembly/gc/issues/130#issuecomment-1029368340
From what I understood:
- `struct`, which is a type in WASM, is garbage collected. https://github.com/WebAssembly/gc/blob/main/proposals/gc/Overview.md#structures
- `i31` in WASM does not necessarily mean that the value is garbage collected. If the value can be represented in 31 bits, it can be stored in an `i31ref`; otherwise it is stored in a `struct`, as per the link above. `i31ref` was introduced to distinguish immediate values from non-immediate ones, as a feature for untyped languages. https://github.com/WebAssembly/gc/blob/main/proposals/gc/Overview.md#unboxed-scalars
- `struct` is what guarantees garbage collection.

Therefore my initial idea was to:
Use the tagged representation of nBallerina and, when a value needs to be boxed (e.g. an integer that cannot be represented in 56 bits), store it in a `struct` and store the `struct` in memory. Here `any` in Ballerina would map to an `i64` type in WASM whose value is either an immediate value or a pointer to memory. Since a struct is garbage collected, this would solve the problem. But it seems that a struct cannot be stored in memory (there is no instruction to store a struct in memory).
The other option is to use the `i31` and `struct` types. Tagging functions would convert an `i64` to either an `i31` or a `struct`. Both `i31` and `struct` types are subtypes of the `ref` type. When the value we want to convert can be represented using 31 bits or fewer, we can use the `i31`; otherwise we need to use the `struct` type. Thus `any` in Ballerina would map to a `ref` type in WASM.
The tag is laid out as `UUIWTTTT`: 4 bits for the type, one for immutable/non-immutable, and one for immediate.
When a `ref` variable is an `i31`, it is essentially immediate.
So, out of the 31 bits, we have:
- Immediate types: Nil, Boolean, Int (both immediate and not)
- String => not immediate (the ASCII characters can each be represented by one byte in UTF-8, but other characters require 2-4 bytes)

Thus we require at least 2 bits to represent the type.
If 2 bits are used for the type, we have 29 bits for storing the immediate value. Rather than storing only 16-bit integers as immediates, using all 29 bits would increase the range of values that can be stored immediately. This would reduce the boxing and unboxing time for values.
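As a rough illustration of the 2-bit tag plus 29-bit payload idea, the JavaScript sketch below packs both into one 31-bit word and unpacks them again. The tag numbers and the names `pack`, `unpack`, and `fitsImmediate` are hypothetical; nothing in this thread fixes them, and a real implementation would do this with `i31` instructions in WASM.

```javascript
// Hypothetical 2-bit type tags (the thread does not assign these numbers).
const TAG_NIL = 0;
const TAG_BOOLEAN = 1;
const TAG_INT = 2;

const PAYLOAD_BITS = 29;
const PAYLOAD_MASK = 0x1fffffff;                  // low 29 bits
const MIN_PAYLOAD = -(2 ** (PAYLOAD_BITS - 1));   // -268435456
const MAX_PAYLOAD = 2 ** (PAYLOAD_BITS - 1) - 1;  //  268435455

// True when n can be stored as a signed 29-bit immediate.
function fitsImmediate(n) {
  return Number.isInteger(n) && n >= MIN_PAYLOAD && n <= MAX_PAYLOAD;
}

// Pack the tag into the high 2 bits and the payload into the low 29 bits.
function pack(tag, payload) {
  if (!fitsImmediate(payload)) {
    throw new RangeError("payload does not fit in 29 bits and needs boxing");
  }
  return ((tag << PAYLOAD_BITS) | (payload & PAYLOAD_MASK)) >>> 0;
}

function unpack(word) {
  const tag = word >>> PAYLOAD_BITS;
  // Shift left, then arithmetic shift right, to sign-extend the payload.
  const payload = (word << (32 - PAYLOAD_BITS)) >> (32 - PAYLOAD_BITS);
  return { tag, payload };
}
```

Under these assumptions any integer between -268435456 and 268435455 stays immediate, compared with -32768 to 32767 if only 16-bit integers were stored this way.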
As suggested, Booleans are represented with `i31ref`, ints with structs, and null with a null ref.
For the following program,
import ballerina/io;
public function main() {
    any x = true;
    any y = 21;
    any z = ();
    io:println(x);
    io:println(y);
    io:println(z);
}
WAT:
(module
  (type $BoxedInt (struct (field $val i64)))
  (import "console" "log" (func $println (param anyref)))
  (export "main" (func $main))
  (export "tagged_to_int" (func $tagged_to_int))
  (export "tagged_to_boolean" (func $tagged_to_boolean))
  (export "get_type" (func $get_type))
  (func $main
    (local $0 anyref)
    (local $1 anyref)
    (local $2 anyref)
    (block
      (local.set $0
        (call $boolean_to_tagged
          (i32.const 1)))
      (local.set $1
        (call $int_to_tagged
          (i64.const 21)))
      (call $println
        (local.get $0))
      (call $println
        (local.get $1))
      (call $println
        (local.get $2))
      (return)))
  (func $int_to_tagged (param $0 i64) (result (ref $BoxedInt))
    (return
      (struct.new_with_rtt
        $BoxedInt
        (local.get $0)
        (rtt.canon $BoxedInt))))
  (func $tagged_to_int (param $0 anyref) (result i64)
    (return
      (struct.get
        $BoxedInt
        $val
        (ref.cast
          (ref.as_data
            (local.get $0))
          (rtt.canon $BoxedInt)))))
  (func $boolean_to_tagged (param $0 i32) (result i31ref)
    (return
      (i31.new
        (local.get $0))))
  (func $tagged_to_boolean (param $0 anyref) (result i32)
    (return
      (i31.get_u
        (ref.as_i31
          (local.get $0)))))
  (func $get_type (param $0 anyref) (result i32)
    (if
      (ref.is_i31
        (local.get $0))
      (return
        (i32.const 1))
      (if
        (ref.is_null
          (local.get $0))
        (return
          (i32.const 0))
        (return
          (i32.const 2))))))
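On the host side, the three exports above could be combined roughly as follows to implement `println`. This is only a sketch: `formatTagged` and `makeImports` are hypothetical names, `exports` is assumed to be the instance's exports object, and printing nil as an empty string is taken from the `// @output` comments in the earlier example.

```javascript
// Type codes returned by $get_type in the WAT above.
const TYPE_NIL = 0;
const TYPE_BOOLEAN = 1;
const TYPE_INT = 2;

// Turn an opaque reference from WASM into the text println should show.
// `exports` is assumed to expose get_type, tagged_to_boolean, tagged_to_int.
function formatTagged(value, exports) {
  switch (exports.get_type(value)) {
    case TYPE_NIL:
      return "";
    case TYPE_BOOLEAN:
      return exports.tagged_to_boolean(value) !== 0 ? "true" : "false";
    case TYPE_INT:
      // An i64 result reaches JavaScript as a BigInt.
      return exports.tagged_to_int(value).toString();
    default:
      throw new Error("unknown type code");
  }
}

// Build the import object; getExports is called lazily because the exports
// only exist after instantiation.
function makeImports(getExports) {
  return {
    console: {
      log: (value) => console.log(formatTagged(value, getExports())),
    },
  };
}
```

The import object would be passed at instantiation time, with `getExports` returning `instance.exports` once the instance exists.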
Using a null ref to represent `null` would mean that the corresponding variable is never initialized (in the above program, local `$2` is not initialized). Wouldn't it be preferable to have a separate value that represents null (an `i31ref` value for null)?
`any` in subset 2 represents either an `integer` or a `boolean`. A Boolean is represented as an `i32` in WASM and an Integer as an `i64`. Thus `any` should map to an `i64`. When `any` is assigned a `boolean`, it should be converted to an `i64`. `i64.extend_i32_u` can be used to extend an unsigned `i32` to `i64`.
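For reference, the effect of `i64.extend_i32_u` can be modelled in JavaScript as below. `extendI32U` and `booleanToAny` are hypothetical helper names used only for this sketch; in the compiler the conversion would be the single WASM instruction.

```javascript
// Model of i64.extend_i32_u: treat the 32-bit pattern as unsigned, then widen.
function extendI32U(x) {
  return BigInt(x >>> 0);
}

// Hypothetical helper: convert a boolean (an i32 0 or 1 in WASM) to the
// i64 that `any` would hold under this scheme.
function booleanToAny(b) {
  return extendI32U(b ? 1 : 0);
}
```

Here `booleanToAny(true)` is `1n` and `booleanToAny(false)` is `0n`; the unsigned extension matters only for bit patterns with the top bit set, e.g. `extendI32U(-1)` is `4294967295n` rather than `-1n`.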