Open TianlongLiang opened 9 months ago
running memory32 wasm/aot files is not possible (though you can still run memory32 wasm/aot files with the default runtime compile without adding the cmake flags)
I wonder if we could consider adding support for running both wasm32 and wasm64 on wamr64 runtime from the start? Our team is very interested in that scenario as that'd enable us to do a/b testing without having sequential native updates. I'd be happy to help with the work around that.
WASMMemory stay unchanged
I agree that max page count bigger than UINT32 doesn't make sense; but just out of curiosity, would that break the ABI for AOT? Or what's the reason for not making the change to conform the spec? If it's due to ABI change, I think it's ok since we're breaking the ABI with this feature anyway.
Just FYI, I'm going to pick up the work on wasi-libc to support it in the toolchain as well; the work has already been started in https://github.com/WebAssembly/wasi-libc/pull/444 but looks like it's stuck atm, so I'm going to make some progress on that. That'd probably make testing memory64 support in wamr with WASI a bit easier.
I wonder if we could consider adding support for running both wasm32 and wasm64 on wamr64 runtime from the start? Our team is very interested in that scenario as that'd enable us to do a/b testing without having sequential native updates. I'd be happy to help with the work around that.
Yes, I think we can definitely support both from the start, it wouldn't change too much from my original design, mostly some instantiating logic changes I guess. I was merely considering my bandwidth so aiming for simplicity for testing. With your team's help, I think we can definitely support both from the start.
I agree that max page count bigger than UINT32 doesn't make sense; but just out of curiosity, would that break the ABI for AOT? Or what's the reason for not making the change to conform the spec? If it's due to ABI change, I think it's ok since we're breaking the ABI with this feature anyway.
You are right that we are breaking the AOT ABI anyway, and breaking the ABI is not my main concern. I was thinking that, since we won't be using more pages, changing the type would touch more code logic for no actual benefit while also taking up more space, making the overall change more complex.
I just realized that my statement can cause some confusion here. What I truly mean is that:
In the loader, we can conform to the spec, loading any uint64 value that the spec allows, but the wasm loader can set a special value (like UINT32_MAX - 1) to indicate that the page count is larger than the allowed value (I was thinking maybe we can set a reasonable allowed value like 32GB or something?).
As long as you are not instantiating it, you are fine. But if you do, then in the wasm instantiating stage or in the wamrc compiling stage we will test if the special value of page count is set and correspondingly throw an exception.
Of course, I could miss something, feel free to point it out and we can discuss other alternatives together 😄
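A minimal sketch of the sentinel idea described above, assuming a 32GB cap and hypothetical names (PAGE_COUNT_TOO_LARGE, MAX_ALLOWED_PAGES, load_page_count, can_instantiate); none of these are taken from WAMR's actual loader code:
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define WASM_PAGE_SIZE 65536ULL /* default 64KB page */
/* Sentinel stored in the uint32 field when the module asks for too much. */
#define PAGE_COUNT_TOO_LARGE (UINT32_MAX - 1)
/* Assumed practical cap: 32GB worth of 64KB pages (524288 pages). */
#define MAX_ALLOWED_PAGES ((32ULL * 1024 * 1024 * 1024) / WASM_PAGE_SIZE)

/* Loader stage: accept any uint64 page count the spec allows, but keep
 * the stored representation as uint32. */
static uint32_t
load_page_count(uint64_t spec_page_count)
{
    if (spec_page_count > MAX_ALLOWED_PAGES)
        return PAGE_COUNT_TOO_LARGE;
    return (uint32_t)spec_page_count;
}

/* Instantiation (or wamrc compile) stage: reject the sentinel. */
static bool
can_instantiate(uint32_t stored_page_count)
{
    return stored_page_count != PAGE_COUNT_TOO_LARGE;
}
```
With this shape, loading a module with an oversized memory succeeds, and the failure only surfaces when the module is instantiated or compiled.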
So we can change those types to:
1. typedef mem_offset_t = uint32/uint64, based on CMake flag
2. uintptr_t
3. uint64
I'd strongly discourage us from using uintptr_t - this is used for representing the host address, and using that for linear memory address is not good for the following reasons:
uintptr_t, I immediately think of the host virtual memory, not linear memory)
I think the first option:
#ifdef WASM64
typedef uint64 mem_offset_t;
#else
typedef uint32 mem_offset_t;
#endif
looks best. I'm not 100% sure though whether offset is the best name to represent the address; I'd assume the offset can be negative, but if we want to use it to represent an absolute address, then perhaps we could use mem_address_t or linear_mem_ptr_t etc.
looks best. I'm not 100% sure though whether offset is the best name to represent the address; I'd assume the offset can be negative, but if we want to use it to represent an absolute address, then perhaps we could use mem_address_t or linear_mem_ptr_t etc.
I think linear_mem_ptr_t is a solid choice
Hi, for the exposed linear memory related APIs (and their internal counterparts), I think there are several questions to discuss: what is the type of the linear memory offset arguments/results in these APIs? Should we define a mem_offset_t or linear_mem_ptr_t type? And should a cmake flag control it?
Note that a developer may directly include wasm_export.h and use the WAMR static/shared lib in their project, in which case the cmake flag in WAMR won't take effect in wasm_export.h, so we should not use an internal macro to control API definitions in wasm_export.h.
So my opinion is that there may be two options:
1. Use uint64 as the type of linear memory offset for both 32-bit and 64-bit targets, and change all related APIs to take uint64 offsets. There is then no need to introduce mem_offset_t or linear_mem_ptr_t.
2. Use uint32 on 32-bit targets and uint64 on 64-bit targets, with a macro such as #if UINTPTR_MAX == UINT64_MAX (or UINT32_MAX) to auto-detect the bit width of the target, and define mem_offset_t or something else accordingly. This assumes that 32-bit targets only support memory32 and that memory64 is unsupported there. Note that on a 64-bit target, the offset is uint64 whether the linear memory is memory32 or memory64.
I think the first one simplifies the implementation and reduces the maintenance effort, but it may slightly impact the footprint and performance. Since there are already many features and the source code is already very complex, it is my personal preference.
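As a rough sketch of how the two options differ, the snippet below shows auto-detection via UINTPTR_MAX next to the always-uint64 approach. It uses the standard stdint.h types so that it is self-contained (WAMR's own uint32/uint64 aliases are not assumed here), and widen_offset is a hypothetical helper:
```c
#include <stdint.h>

/* Option 2 sketch: pick the offset width from the target's pointer width. */
#if UINTPTR_MAX == UINT64_MAX
/* 64-bit target: offsets are 64-bit for both memory32 and memory64. */
typedef uint64_t mem_offset_t;
#elif UINTPTR_MAX == UINT32_MAX
/* 32-bit target: only memory32 is assumed to be supported. */
typedef uint32_t mem_offset_t;
#else
#error "unsupported pointer width"
#endif

/* Option 1 sketch: APIs always take uint64; a memory32 offset is simply
 * zero-extended, so no information is lost. */
static uint64_t
widen_offset(uint32_t offset32)
{
    return (uint64_t)offset32;
}
```
Because the second option keys off a standard macro rather than a WAMR build flag, the header yields the same definition for the runtime and for an embedder that includes it directly.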
BTW, we had better refactor the current code to change API definitions and data structure definitions (memory instance, module instance, etc.) and resolve historical issues (e.g. UINT32_MAX linear memory size limitation), to make the AOT ABI well defined before we start to implement memory64, since we may release 2.0.0 after GC is merged and before memory64 is implemented.
Refactor APIs and data structures as preliminary work for Memory64 was merged into main.
do you have any plans on table64? https://github.com/WebAssembly/memory64/issues/51
Currently, there is no plan to implement the table64 proposal; a corresponding new RFC issue would probably be opened if anyone were to implement it.
They clearly f things up on 64 bit support. I want to duplicate all wasi apis to have a _wasm64 suffix but they seem to even ignore what i am saying. I would like to hear what you guys think.
A small hint: if MEMORY64 is enabled and a wasm64 module is loaded, are strings handled by i64 pointers instead of i32?
Then the signature check should be patched to support I64 for $ and * for native functions. (wasm_native.c : 131 ) if (type->types[i] != VALUE_TYPE_I32) /* pointer and string must be i32 type */ return false;
Perhaps this helps someone - wonder about signature mismatch error - experimenting with 64 Bit ;)
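A minimal sketch of the kind of patch hinted at above, assuming a hypothetical helper pointer_param_type_ok and an is_memory64 flag; this is not the actual code in wasm_native.c. The two type constants use the value type encodings from the WebAssembly binary format:
```c
#include <stdbool.h>
#include <stdint.h>

/* Value type encodings from the WebAssembly binary format. */
#define VALUE_TYPE_I32 0x7F
#define VALUE_TYPE_I64 0x7E

/* Hypothetical helper: decide whether a wasm parameter type is acceptable
 * for a '*' (pointer) or '$' (string) entry in a native signature.
 * With memory64 the linear memory address is i64, otherwise i32. */
static bool
pointer_param_type_ok(uint8_t wasm_type, bool is_memory64)
{
    uint8_t expected = is_memory64 ? VALUE_TYPE_I64 : VALUE_TYPE_I32;
    return wasm_type == expected;
}
```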
Thanks @gajanak, this is a known issue and we were planning to address it as part of the WASI work (overall, memory64 support is still at quite an experimental stage)
And one more TODO in the future, not WIP currently, but listed here for reference: Implement table64 extension
Great to see this being worked on -- any insight into when this feature will be ready for use in a development setting and in a production setting?
I start to work on table64 and hopefully, it can be ready in a month or so
Summary
Plan to implement basic support for the memory64 proposal, first focusing on the classic interpreter and AOT running modes.
Overview of Basic Support Plan
With basic support completed, users should experience the following:
Use cmake -DWAMR_BUILD_MEMORY64=1 to compile a memory64-enabled iwasm VM core. Initially, only memory64-enabled wasm/aot files will be supported. In other words, a memory64-enabled runtime VM core will only support memory64 wasm/aot files for basic support; running memory32 wasm/aot files is not possible (though you can still run memory32 wasm/aot files with the default runtime compile without adding the cmake flags). Further development will ensure compatibility of memory64-enabled runtime with both memory32 and memory64 wasm/aot files. Maybe make them compatible from the start.
For the wamrc AOT compiler, no extra work is required during compile time. To compile a memory64 aot file, use wamrc --enable-memory64 -o test.aot test.wasm to generate a memory64 aot file.
When you integrate a WAMR VMcore into your host application, you can use certain APIs that WAMR exports. These APIs help manage linear memory and enable the conversion and verification of addresses between the host's native environment and the app's linear memory. However, to accommodate memory64, these APIs will undergo modifications. The main changes will be in the types of arguments and return values. Specifically, the representation of a linear memory address will shift from uint32 to uint64. This change applies regardless of whether you enable memory64. In a memory32 scenario, this adjustment is seamless, as there is no narrow conversion involved. Instead, uint32 values are simply expanded to uint64. This modification affects the following APIs, here is a preview of their signature modifications:
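As a hedged illustration only, the sketch below shows the shape of such a change with a hypothetical validate_app_addr function; the name, parameters, and logic are assumptions and not WAMR's actual exported signatures. It takes uint64 offsets and sizes, and a memory32 caller can pass its uint32 values unchanged because they are zero-extended:
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an address-validation API after the change:
 * offset and size are uint64 whether the module is memory32 or memory64. */
static bool
validate_app_addr(uint64_t memory_size, uint64_t app_offset, uint64_t size)
{
    /* Compare against the remaining space instead of computing
     * app_offset + size, which could overflow uint64. */
    return app_offset <= memory_size && size <= memory_size - app_offset;
}
```
A caller holding a uint32 offset from a memory32 module can call validate_app_addr(memory_size, offset32, 4) directly; the implicit conversion widens the value without loss.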
Impact of memory64 proposal on WAMR's data structure
This section lists the changes in the memory64 proposal that could potentially influence WAMR's internal data structures and code logic. Let's see what those influences could be.
Memory initial/max page count: According to the memory64 proposal, when memory64 is used, the default page size remains unchanged (64KB), but the initial/max page count can theoretically be as large as 2^48, which corresponds to 2^64 bytes (16 EiB, roughly 18 million terabytes). Since using that much memory is unrealistic in practice, we stick with the uint32 representation of page count.
To conclude, the data structure representing WASMMemory stays unchanged; the initial page count and max page count stay as u32.
Data segment offset: for memory64, it should be i64 now.
WASMDataSeg: the offset's type will become i64
AOTMemInitData
Memory instruction: The type of memarg’s offset in all memory instructions needs to be changed, as well as the parameters representing linear memory addresses taken by the memory instruction on the stack, a total of 116 opcodes.
WASMModule and AOTModule: the fields that represent linear memory address, use uint64 instead:
Also in AOT file, those fields will be emitted as u64. So in a sense, AOT ABI will change.
WASMMemoryInstance (AOTMemoryInstance): memory_data_size type changes can result in AOT offset calculation logic changes (details are in create_memory_info)
WASMExecEnv
Topics for Further Discussion
😄 Suggestions or input on the following topics (or others you can think of) are appreciated.
1. Should Memory64 Only Be Supported on 64-Bit Machines?
I believe it's not very useful to support the memory64 proposal on 32-bit machines, as more than 4GB of memory can't be utilized anyway. If linear memory is below 4GB, sticking with default memory32 seems sensible. Supporting memory64 on 32-bit machines would also add implementation complexity. Therefore, the initial plan is not to support the memory64 proposal on 32-bit machines.
2. Exported Runtime APIs for Host Use When Embedding WAMR VMCore
As previously mentioned, the initial plan is to add new APIs like wasm64_xxx for memory64 wasm/aot files, while keeping the old APIs unchanged for memory32 wasm/aot files. Alternatively, we could modify existing APIs without adding new ones.
These two choices are worth considering. The first fits well on 32-bit platforms, and the second maintains consistent function signatures. At first glance, they seem preferable to adding new wasm64_xxx APIs. However, they could cause problems for existing programs that want to be compiled with the updated version of WAMR, which is why adding new APIs is the primary choice. Additionally, new APIs can reduce programming errors and enhance readability by clearly indicating the memory standard used for a given WASM file/module, with explicit return value types.
3. Internal API implementation choice
Adding new export APIs or changing the export APIs requires corresponding changes in internal implementation APIs too. For example:
The ptr, size, and returned ptr should be uint64 when it's Memory64.
So we can change those types to:
1. typedef mem_offset_t = uint32/uint64, based on CMake flag
2. uintptr_t
3. uint64
The initial choice is 1, since nothing changes when the CMake flag is disabled. Only when it is enabled for the runtime do we change the internal implementation, change the code logic, do type conversion, etc.
4. Maximum size for memory64 linear memory
Theoretically, the linear memory for memory64 can be much larger even with our existing u32 representation of page count: it is on the scale of hundreds of terabytes.
Should we have a practical maximum size for linear memory to control the actual memory usage? Like 64GB or something?
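To make the scale concrete, here is a small sketch of the arithmetic, with a hypothetical exceeds_cap check against a configurable limit such as the 64GB mentioned above (both function names are illustrative assumptions, not WAMR code):
```c
#include <stdbool.h>
#include <stdint.h>

#define WASM_PAGE_SIZE 65536ULL /* default 64KB page */

/* Bytes covered by a u32 page count. The largest value, UINT32_MAX pages,
 * gives 2^48 - 2^16 bytes, i.e. just under 256 TiB, so the product
 * always fits in uint64. */
static uint64_t
pages_to_bytes(uint32_t page_count)
{
    return (uint64_t)page_count * WASM_PAGE_SIZE;
}

/* Hypothetical practical limit check, e.g. cap_bytes = 64ULL << 30. */
static bool
exceeds_cap(uint32_t page_count, uint64_t cap_bytes)
{
    return pages_to_bytes(page_count) > cap_bytes;
}
```
With a 64GB cap, 1048576 pages is the largest page count that would still be accepted.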