TianlongLiang commented 9 months ago

Summary

Plan to implement basic support for the memory64 proposal, first focusing on the classic interpreter and AOT running modes.

Overview of Basic Support Plan

With basic support completed, users should experience the following:

Use cmake -DWAMR_BUILD_MEMORY64=1 to compile a memory64-enabled iwasm VM core. Initially, only memory64-enabled wasm/aot files will be supported. In other words, a memory64-enabled runtime VM core will only support memory64 wasm/aot files for basic support; running memory32 wasm/aot files is not possible (though you can still run memory32 wasm/aot files with the default runtime compile without adding the cmake flags). Further development will ensure compatibility of memory64-enabled runtime with both memory32 and memory64 wasm/aot files. Maybe make them compatible from the start
For the wamrc AOT compiler, no extra work is required during compile time. To compile a memory64 aot file, use wamrc --enable-memory64 -o test.aot test.wasm to generate a memory64 aot file.

PS: A warning/exception may be given if the memory64 feature is not actually used in a memory32 wasm file.

When you integrate a WAMR VMcore into your host application, you can use certain APIs that WAMR exports. These APIs help manage linear memory and enable the conversion and verification of addresses between the host's native environment and the app's linear memory. However, to accommodate memory64, these APIs will undergo modifications. The main changes will be in the types of arguments and return values. Specifically, the representation of a linear memory address will shift from uint32 to uint64. This change applies regardless of whether you enable memory64. In a memory32 scenario, this adjustment is seamless, as there is no narrow conversion involved. Instead, uint32 values are simply expanded to uint64. This modification affects the following APIs, here is a preview of their signature modifications:

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN uint64
wasm_runtime_module_malloc(WASMModuleInstanceCommon *module_inst, uint64 size, void **p_native_addr);

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN void
wasm_runtime_module_free(WASMModuleInstanceCommon *module_inst, uint64 ptr);

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN uint64
wasm_runtime_module_dup_data(WASMModuleInstanceCommon *module_inst, const char *src, uint64 size);

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN bool
wasm_runtime_validate_app_addr(WASMModuleInstanceCommon *module_inst, uint64 app_offset, uint64 size);

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN bool
wasm_runtime_validate_app_str_addr(WASMModuleInstanceCommon *module_inst, uint64 app_str_offset);

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN bool
wasm_runtime_validate_native_addr(WASMModuleInstanceCommon *module_inst, void *native_ptr, uint64 size);

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN void *
wasm_runtime_addr_app_to_native(WASMModuleInstanceCommon *module_inst, uint64 app_offset);

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN uint64
wasm_runtime_addr_native_to_app(WASMModuleInstanceCommon *module_inst, void *native_ptr);

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN bool
wasm_runtime_get_app_addr_range(WASMModuleInstanceCommon *module_inst, uint64 app_offset, uint64 *p_app_start_offset, uint64 *p_app_end_offset);

/* See wasm_export.h for description */
WASM_RUNTIME_API_EXTERN bool
wasm_runtime_get_native_addr_range(WASMModuleInstanceCommon *module_inst, uint8 *native_ptr, uint8 **p_native_start_addr, uint8 **p_native_end_addr);

Impact of memory64 proposal on WAMR's data structure

This section lists the changes in the memory64 proposal that could potentially influence WARM's internal data structure and code logic. Let's see what those influences could be

Memory initial/max page count: According to the memory64 proposal, when memory64 is used, the default page size remains unchanged(64KB), but the initial page count/max page count theoretically can maximally be 2^48 now, which is roughly 2.2 million terabytes. Given it's kind of surreal to use that much memory in real life, stick with the uint32 representation of page count.

To conclude, the data structure represents WASMMemory stay unchanged, the initial page count and max page count stay as u32.

PS: The maximum value can be 2^32 - 1 pages which is about 275 TB. The value exceeding the value is marked by a throw exception. It's not fully conform to spec but it's a practical design choice.
Data segment offset: for memory64, it should be should be i64 now
- In WASMDataSeg, the offset's type will become i64
- So as the offset type in AOTMemInitData

Memory instruction: The type of memarg’s offset in all memory instructions needs to be changed, as well as the parameters representing linear memory addresses taken by the memory instruction on the stack, a total of 116 opcodes.

In WASMModule and AOTModule the fields that represent linear memory address, use uint64 instead:

/* the index of auxiliary __data_end global,
  -1 means unexported */
uint32 aux_data_end_global_index;
/* auxiliary __data_end exported by wasm app */
/* TODO: type changes to uint64 */
uint32 aux_data_end;

/* the index of auxiliary __heap_base global,
  -1 means unexported */
uint32 aux_heap_base_global_index;
/* auxiliary __heap_base exported by wasm app */
/* TODO: type changes to uint64 */
uint32 aux_heap_base;

/* the index of auxiliary stack top global,
  -1 means unexported */
uint32 aux_stack_top_global_index;
/* auxiliary stack bottom resolved */
/* TODO: type changes to uint64 */
uint32 aux_stack_bottom;

Also in AOT file, those fields will be emitted as u64. So in a sense, AOT ABI will change.

In WASMMemoryInstance(AOTMemoryInstance)

/* Module type */
uint32 module_type;

/* Whether the memory is shared */
uint8 is_shared_memory;

/* TODO: changes to is_memory_64 flags */
uint8 __padding__;

/* Reference count of the memory instance:
     0: non-shared memory, > 0: shared memory */
bh_atomic_16_t ref_count;

/* Number bytes per page */
uint32 num_bytes_per_page;
/* Current page count */
uint32 cur_page_count;
/* Maximum page count */
uint32 max_page_count;
/* TODO: add new Paddings for alignment */
uint32 padding;
/* TODO: type changes to uint64 */
uint32 memory_data_size;

......other data member

memory_data_size type changes can result to AOT offset calculation logic changes(details are in create_memory_info)

In WASMExecEnv

/* TODO: boundary and bottom to uint64? */
/* Auxiliary stack boundary */
union {
    uint32 boundary;
    uintptr_t __padding__;
} aux_stack_boundary;
/* Auxiliary stack bottom */
union {
    uint32 bottom;
    uintptr_t __padding__;
} aux_stack_bottom;

Topics for Further Discussion

😄 Suggestions or input on the following topics (or others you can think of) are appreciated.

1. Should Memory64 Only Be Supported on 64-Bit Machines?

I believe it's not very useful to support the memory64 proposal on 32-bit machines, as more than 4GB of memory can't be utilized anyway. If linear memory is below 4GB, sticking with default memory32 seems sensible. Supporting memory64 on 32-bit machines would also add implementation complexity. Therefore, the initial plan is not to support the memory64 proposal on 32-bit machines.

2. Exported Runtime APIs for Host Use When Embedding WAMR VMCore

As previously mentioned, the initial plan is to add new APIs like wasm64_xxx for memory64 wasm/aot files, while keeping the old APIs unchanged for memory32 wasm/aot files.

// New APIs
WASM_RUNTIME_API_EXTERN uint64
wasm64_runtime_module_malloc(WASMModuleInstanceCommon *module_inst, uint64 size, void **p_native_addr);
// Old counterpart APIs
WASM_RUNTIME_API_EXTERN uint32
wasm_runtime_module_malloc(WASMModuleInstanceCommon *module_inst, uint32 size, void **p_native_addr);

Alternatively, we could modify existing APIs without adding new ones.

// Choice 1: use uintptr_t
WASM_RUNTIME_API_EXTERN uintptr_t
wasm_runtime_module_malloc(WASMModuleInstanceCommon *module_inst, uintptr_t size, void **p_native_addr);
// Choice 2: use uint64
WASM_RUNTIME_API_EXTERN uint64
wasm_runtime_module_malloc(WASMModuleInstanceCommon *module_inst, uint32 size, void **p_native_addr);

These two choices are worth considering. The first fits well on 32-bit platforms, and the second maintains consistent function signatures. At first glance, they seem preferable to adding new wasm64_xxx APIs. However, they could cause problems for existing programs that want to be compiled with the updated version of WAMR, which is why adding new APIs is the primary choice. Additionally, new APIs can reduce programming errors and enhance readability by clearly indicating the memory standard used for a given WASM file/module, with explicit return value types.

Plan to go with choice 2: use uint64 to modify existing APIs, keeping one set of APIs is easier for future maintenance.

3. Internal API implementation choice

Adding new export APIs or changing the export APIs requires corresponding changes in internal implementation APIs too. For example:

/* Internal API */
uint32
wasm_runtime_module_realloc_internal(WASMModuleInstanceCommon *module_inst,
                                    WASMExecEnv *exec_env, uint32 ptr,
                                    uint32 size, void **p_native_addr);

The ptr, size, and returned ptr should be uint64 when it's Memory64.

So we can change those types to:

typedef mem_offset_t = uint32/uint64; based on CMake flag
uintptr_t;
uint64;

The initial choice is 1 for it won't have any change if the CMake flag is disabled. Only when it is enabled for runtime, then we change the internal implementation, change the code logic, do type conversion, etc.

Plan to go with choice 1

#ifdef WASM_ENABLE_MEMORY64 != 0
typedef linear_mem_ptr_t uint64;
#else
typedef linear_mem_ptr_t uint32;
#endif

4. Maximum size for memory64 linear memory

Theoretically, the linear memory for memory64 can be much larger even using our existing u32 data representation for page count, it is in the scale of hundreds of TeraBytes.

Should we have a practical maximum size for linear memory to control the actual usage of memory? Like 64GB or something

loganek commented 9 months ago

running memory32 wasm/aot files is not possible (though you can still run memory32 wasm/aot files with the default runtime compile without adding the cmake flags)

I wonder if we could consider adding support for running both wasm32 and wasm64 on wamr64 runtime from the start? Our team is very interested in that scenario as that'd enable us to do a/b testing without having sequential native updates. I'd be happy to help with the work around that.

WASMMemory stay unchanged

I agree that max page count bigger than UINT32 doesn't make sense; but just out of curiosity, would that break the ABI for AOT? Or what's the reason for not making the change to conform the spec? If it's due to ABI change, I think it's ok since we're breaking the ABI with this feature anyway.

Just FYI, I'm going to pick up the work on the wasi-libc to support it in the toolchain as well; the work has already been started in https://github.com/WebAssembly/wasi-libc/pull/444 but looks like it's stuck atm so I'm going to make some progress on that. That'd probably help us testing memory64 support in wamr with WASI a bit easier.

TianlongLiang commented 9 months ago

I wonder if we could consider adding support for running both wasm32 and wasm64 on wamr64 runtime from the start? Our team is very interested in that scenario as that'd enable us to do a/b testing without having sequential native updates. I'd be happy to help with the work around that.

Yes, I think we can definitely support both from the start, it wouldn't change too much from my original design, mostly some instantiating logic changes I guess. I was merely considering my bandwidth so aiming for simplicity for testing. With your team's help, I think we can definitely support both from the start.

I agree that max page count bigger than UINT32 doesn't make sense; but just out of curiosity, would that break the ABI for AOT? Or what's the reason for not making the change to conform the spec? If it's due to ABI change, I think it's ok since we're breaking the ABI with this feature anyway.

You are right that we are breaking AOT ABI anyway, but breaking ABI is not my main concern. I was thinking since we won't be using more pages, changing the type would change more code logic for no actual benefits but also take up more space, making the whole changes more complex.

I just realized that my statement can cause some confusion here. What I truly mean is that:

In the loader, we can conform to the spec, loading any uint64 values that the spec allows, but the wasm loader can set a special value(like UINT32MAX - 1) value to indicate it is larger than the allowed value(I was thinking maybe we can set a reasonable allowed value like 32GB or something?).
As long as you are not instantiating it, you are fine. But if you do, then in the wasm instantiating stage or in the wamrc compiling stage we will test if the special value of page count is set and correspondingly throw an exception.

Of course, I could miss something, feel free to point it out and we can discuss other alternatives together 😄

loganek commented 9 months ago

So we can change those types to:

typedef mem_offset_t = uint32/uint64; based on CMake flag uintptr_t; uint64;

I'd strongly discourage us from using uintptr_t - this is used for representing the host address, and using that for linear memory address is not good for the following reasons:

it might be not big enough (if we ever, for whatever reason, decide to run wasm64 on 32bit systems)
it might be unnecessarily big (e.g. in case of wasm32 running on x86_64)
it's confusing (if I see uintptr_t, I immediately think of the host virtual memory, not linear memory)

I think the first option:

#ifdef WASM64
typedef mem_offset_t uint64;
#else
typedef mem_offset_t uint32;
#endif

looks best. I'm not 100% sure though whether offset is the best name to represent the address; I'd assume the offset can be negative, but if we want to use it to represent an absolute address, then perhaps we could use mem_address_t or linear_mem_ptr_t etc.

TianlongLiang commented 9 months ago

looks best. I'm not 100% sure though whether offset is the best name to represent the address; I'd assume the offset can be negative, but if we want to use it to represent an absolute address, then perhaps we could use mem_address_t or linear_mem_ptr_t etc.

I think linear_mem_ptr_t is a solid choice

wenyongh commented 9 months ago

Hi, for the linear memory related APIs exposed (and its internal APIs), I think there may be several questions to discuss: what is type of related linear memory offset arguments/results in these APIs? Whether to define mem_offset_t or linear_mem_ptr_t type? And whether to use cmake flag to control it?

Note that developer may directly include wasm_export.h and use wamr static/shared lib in his project, in which the cmake flag in WAMR won't take effect in wasm_export.h, so we should not use internal macro to control API definitions in wasm_export.h.

So my opinion is that there may be two options:

One is to use uint64 as the type of linear memory offset for both 32-bit target and 64-bit target, and all related APIs are changed to uint64 offset. And no need to introduce mem_offset_t or linear_mem_ptr_t.
The other is to use uint32 in 32-bit target and uint64 for 64-bit target, and use macro #if UINTPTR_MAX == UINT64_MAX (or UINT32_MAX) to auto-detect the bit width of the target, and define mem_offset_t or something else accordingly. This supposes that 32-bit target only supports memory32 and memory64 is unsupported. And note that in 64-bit target, the offset is uint64 no matter the linear memory is memory32 or memory64.

I think the first one simplifies the implementation and reduces the maintain effort, but it may slightly impact the footprint and performance. Since there are too many features now and the source code is already very complex, it is my personal preference.

BTW, we had better refactor the current code to change API definitions and data structure definitions (memory instance, module instance, etc.) and resolve historical issues (e.g. UINT32_MAX linear memory size limitation), to make the AOT ABI well defined before we start to implement memory64, since we may release 2.0.0 after GC is merged and before memory64 is implemented.

wenyongh commented 7 months ago

Refactor APIs and data structures as preliminary work for Memory64 was merged into main.

wenyongh commented 7 months ago

https://github.com/bytecodealliance/wasm-micro-runtime/pull/3240 https://github.com/bytecodealliance/wasm-micro-runtime/pull/3260 https://github.com/bytecodealliance/wasm-micro-runtime/pull/3266

yamt commented 5 months ago

do you have any plans on table64? https://github.com/WebAssembly/memory64/issues/51

TianlongLiang commented 5 months ago

Currently, there is no plan regarding the table64 proposal implementation, a corresponding new RFC issue probably would be open if anyone were to implement it

trcrsired commented 4 months ago

They clearly f things up on 64 bit support. I want to duplicate all wasi apis to have a _wasm64 suffix but they seem to even ignore what i am saying. I would like to hear what you guys think.

gajanak commented 4 months ago

A small hint: If MEMORY64 is enabled , and a wasm64 is loaded, Strings are handled by I64 pointer isnteas i32 ?

Then the Signature Check should be patched, to support I64 for $ and for native functions. (wasm_native.c : 131 ) if (type->types[i] != VALUE_TYPE_I32) / pointer and string must be i32 type */ return false;

if WASM_ENABLE_MEMORY64 -> this should be VALUE_TYPE_I64

Perhaps this helps someone - wonder about signature mismatch error - experimenting with 64 Bit ;)

loganek commented 4 months ago

Thanks @gajanak , this is a known issue and we were planning to address it as part of the WASI work (overall, memory64 support is in still quite an experimental stage)

TianlongLiang commented 4 months ago

And one more TODO in the future, not WIP currently, but listed here for reference: Implement table64 extension

fridaymore commented 2 months ago

Great to see this being worked on -- any insight into when this feature will be ready for use in a development setting and in a production setting?

TianlongLiang commented 2 months ago

I start to work on table64 and hopefully, it can be ready in a month or so

bytecodealliance / wasm-micro-runtime

[RFC] Basic support for Memory64 proposal #3091

Summary

Overview of Basic Support Plan

Impact of memory64 proposal on WAMR's data structure

Topics for Further Discussion

1. Should Memory64 Only Be Supported on 64-Bit Machines?

2. Exported Runtime APIs for Host Use When Embedding WAMR VMCore

3. Internal API implementation choice

4. Maximum size for memory64 linear memory

if WASM_ENABLE_MEMORY64 -> this should be VALUE_TYPE_I64