Samsung / walrus

WebAssembly Lightweight RUntime
Apache License 2.0

Implementing instance offsets #73

Closed zherczeg closed 1 year ago

zherczeg commented 1 year ago

I have started to implement a low-level interface for instance. It is just a concept, it can be changed in any way. The key points:

What do you think?

zherczeg commented 1 year ago

The offsets can be stored in the Module rather than in the instance, since they are the same for all instances. They are only needed for high-level access, since the byte code and JIT code use these byte offset constants directly.
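To make the idea concrete, here is a minimal sketch of offsets that live in the module while the data lives in each instance. All names (`ModuleLayout`, `InstanceData`, `storeI64`, `loadI64`, `getGlobal`) are hypothetical and are not taken from the walrus sources; the sketch also assumes every global occupies 8 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical names throughout; this is not the walrus API.
// The layout is computed once per module: the byte offset of each global
// inside the per-instance data block.
struct ModuleLayout {
    std::vector<size_t> globalOffsets;
    size_t instanceDataSize = 0;

    // Reserve 8 bytes for one global and return its index.
    size_t addGlobal() {
        globalOffsets.push_back(instanceDataSize);
        instanceDataSize += sizeof(uint64_t);
        return globalOffsets.size() - 1;
    }
};

// Each instance owns its own zero-initialized data block but shares the layout.
struct InstanceData {
    explicit InstanceData(const ModuleLayout& layout)
        : bytes(layout.instanceDataSize, 0) {}
    std::vector<uint8_t> bytes;
};

// Low-level access, the way byte code or JIT code would do it:
// base address plus a constant byte offset.
inline void storeI64(InstanceData& inst, size_t byteOffset, int64_t value) {
    std::memcpy(inst.bytes.data() + byteOffset, &value, sizeof(value));
}

inline int64_t loadI64(const InstanceData& inst, size_t byteOffset) {
    int64_t value;
    std::memcpy(&value, inst.bytes.data() + byteOffset, sizeof(value));
    return value;
}

// High-level access: the offset is looked up in the module by index.
inline int64_t getGlobal(const ModuleLayout& layout, const InstanceData& inst,
                         size_t index) {
    return loadI64(inst, layout.globalOffsets.at(index));
}
```

Two instances built from the same `ModuleLayout` use identical offsets, but writing through one does not affect the other.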

clover2123 commented 1 year ago

All WebAssembly data has a byte offset, which is computed at compile time. Both interpreter and jit can use these offsets to access the data, no std::vector or std::shared_ptr is required. The offsets can be part of jit instructions and interpreter byte codes.

What is WebAssembly data here? Do you mean the Memory, which is linearly allocated at compile time? Or does it include all data structures such as ImportType, ExportType, Memory, Table, Function, etc. that are required during wasm execution?

The WebAssembly data is now the full data structure (e.g. memory), not a reference pointer. This can be changed if reference support is needed.

Yes, the current wasm standard specifies that only numbers can be stored in the Memory structure and only one Memory structure is allowed for each Module. But several extern objects like Function, Memory, Table, Global can be shared with other Modules (Instances) through import/export. We need to manage the deallocation of these shared structures carefully.

zherczeg commented 1 year ago

What is WebAssembly data here? Do you mean the Memory, which is linearly allocated at compile time? Or does it include all data structures such as ImportType, ExportType, Memory, Table, Function, etc. that are required during wasm execution?

All data which the JIT needs to access directly and which might differ between instances, such as raw memory or globals. Other read-only data, such as the indirect function call table, must also be accessible somehow. As far as I remember this table is the same for all instances, so it can be stored in the module. The JIT can access raw pointers (and primitive types) inside C++ structures, but cannot use high-level constructs such as std::vector or std::refptr.

But several extern objects like Function, Memory, Table, Global can be shared with other Modules (Instances) through import/export.

Ownership is important here. If the instance owns Memory objects or global values, and others can access them indirectly, it is faster since the resource can be accessed directly. If the resource is shared, and the instance only has a pointer to it, reading the pointer has a performance overhead. If reference counting is needed, maybe we should implement it ourselves (probably creating some base class for it).

In other words the question is where the resource is stored.

a) If the resource is part of an instance, the access is faster (jit / interpreter runs faster), but sharing the resource is limited (two instances cannot use the same memory or global). In this case each resource has a byte offset inside the instance memory.

b) The resources are independent memory objects with reference counters. In this case the instance has an array of pointers after the end of the instance to each resource. JIT / interpreter knows the pointer index when a resource is accessed. The pointers of different resource types can be in any order, even mixed, since the pointer index is computed at compile time, and should be part of the byte code and jit instructions.

Which one do you prefer? Or do you have another suggestion?
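Option b) can be sketched as a header followed by a trailing pointer array in the same allocation. The names below (`Instance`, `resources`, `loadGlobalI64`) are hypothetical and only illustrate the layout; they are not the walrus implementation. The sketch assumes a global is stored as a plain `int64_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical instance header; the resource pointers follow it in memory.
struct Instance {
    size_t resourceCount;

    // The pointer array starts immediately after the header.
    void** resources() { return reinterpret_cast<void**>(this + 1); }

    // Allocate the header and the pointer array as one zeroed block.
    static Instance* create(size_t count) {
        void* mem = std::calloc(1, sizeof(Instance) + count * sizeof(void*));
        if (!mem)
            throw std::bad_alloc();
        return new (mem) Instance{count};
    }

    static void destroy(Instance* inst) {
        inst->~Instance();
        std::free(inst);
    }
};

// The trailing array is only correctly aligned if the header size is a
// multiple of the pointer alignment.
static_assert(sizeof(Instance) % alignof(void*) == 0,
              "pointer array must be aligned");

// What generated code would do: the pointer index is a compile-time
// constant, so the resource pointer sits at a fixed byte offset.
inline int64_t loadGlobalI64(Instance* inst, size_t pointerIndex) {
    const char* base = reinterpret_cast<const char*>(inst);
    void* resource;
    std::memcpy(&resource,
                base + sizeof(Instance) + pointerIndex * sizeof(void*),
                sizeof(resource));
    return *static_cast<int64_t*>(resource);
}
```

Compared with option a), every access pays for one extra pointer load, but the same resource can be referenced from several instances.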

clover2123 commented 1 year ago

As far as I remember this table is the same for all instances, so it can be stored in the module.

As you mentioned, the offset of data is the same for all instances. But each data needs to be newly created when instantiated from the module. IMO data should be stored in instance, not module. For example, when two instances are instantiated from the same module, each instance has its own table. These tables are not shared between the two instances; they are just stored at the same location (offset) inside each instance (correct me if I am wrong). Data sharing between instances is enabled only through import/export.

If reference counting is needed, maybe we should implement it ourselves (probably creating some base class for it).

I'm working on reference counting method. It will be updated soon.

b) The resources are independent memory objects with reference counters. In this case the instance has an array of pointers after the end of the instance to each resource. JIT / interpreter knows the pointer index when a resource is accessed. The pointers of different resource types can be in any order, even mixed, since the pointer index is computed at compile time, and should be part of the byte code and jit instructions.

Since data (function, table, memory, global) can be created outside of wasm via the wasm-c-api or the wasm-js API, it seems better to have an array of data pointers in instance. A global Store structure also needs to manage the life cycle of resources, for example by deallocating resources whose reference count has dropped to zero. The overall design of data management is quite complicated when we consider APIs together :(
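A rough sketch of that combination follows: a hand-written reference-counting base class plus a Store that frees unreferenced objects. The `Object`, `Global`, and `Store` classes and their methods are hypothetical illustrations, not the design that landed in walrus.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical base class providing an intrusive reference counter.
class Object {
public:
    virtual ~Object() = default;
    void ref() { ++m_refCount; }
    void deref() {
        assert(m_refCount > 0);
        --m_refCount;
    }
    size_t refCount() const { return m_refCount; }

private:
    size_t m_refCount = 0;
};

// Example resource type; a real runtime would also have Memory, Table, etc.
struct Global : Object {
    int64_t value = 0;
};

// Hypothetical Store: it tracks every object it creates and frees the
// ones that nobody references any more.
class Store {
public:
    Store() = default;
    Store(const Store&) = delete;
    Store& operator=(const Store&) = delete;

    // Anything still alive is released when the Store goes away.
    ~Store() {
        for (Object* object : m_objects)
            delete object;
    }

    template <typename T, typename... Args>
    T* create(Args&&... args) {
        T* object = new T(std::forward<Args>(args)...);
        m_objects.push_back(object);
        return object;
    }

    // Delete all objects whose counter is zero; returns how many were freed.
    size_t collect() {
        auto firstDead = std::partition(
            m_objects.begin(), m_objects.end(),
            [](Object* object) { return object->refCount() > 0; });
        size_t released = static_cast<size_t>(m_objects.end() - firstDead);
        for (auto it = firstDead; it != m_objects.end(); ++it)
            delete *it;
        m_objects.erase(firstDead, m_objects.end());
        return released;
    }

    size_t size() const { return m_objects.size(); }

private:
    std::vector<Object*> m_objects;
};
```

Because the counter is a plain integer member, generated code could increment or decrement it directly, while the actual deallocation stays inside the Store.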

zherczeg commented 1 year ago

But each data needs to be newly created when instantiated from the module. IMO data should be stored in instance, not module.

I assumed the list of functions in the indirect function call table is immutable, so I thought it can be stored in the module to reduce memory usage. But if it can be modified, then you are right, we need to store it in the instance, since different instances can have different lists.

I'm working on reference counting method. It will be updated soon.

Thank you.

It seems better to have an array of data pointers in instance.

You are right. This implementation allows great freedom in using these resources. Values can still be cached when we know they cannot change.

The overall design of data management is quite complicated when we consider APIs together :(

True. Unfortunately the implementation of std::vector and similar templates is compiler dependent, so gcc and clang, or even different versions of gcc, might lay them out differently. So the JIT is stuck with primitive values (pointers, integers). Hopefully we can add a nice C++ interface that hides the low-level access.

clover2123 commented 1 year ago

I assumed the list of functions in the indirect function call table is immutable, so I thought it can be stored in the module to reduce memory usage.

The table can be modified through the table.set instruction, so the call table is mutable.
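A small sketch shows why a mutable table has to live in the instance. The `Module` and `Instance` types here are hypothetical stand-ins, and functions are reduced to plain indices for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using FuncIndex = uint32_t;

// Hypothetical module: it only holds the initial, read-only table contents.
struct Module {
    std::vector<FuncIndex> initialTable;
};

// Hypothetical instance: it copies the table at instantiation time, so a
// later table.set affects this instance only.
struct Instance {
    explicit Instance(const Module& module) : table(module.initialTable) {}

    void tableSet(size_t slot, FuncIndex function) { table.at(slot) = function; }
    FuncIndex tableGet(size_t slot) const { return table.at(slot); }

    std::vector<FuncIndex> table;
};
```

After `tableSet` on one instance, a second instance created from the same module still sees the original entry, which would not hold if the table were stored once in the module.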

Unfortunately the implementation of std::vector and similar templates is compiler dependent, so gcc and clang, or even different versions of gcc, might lay them out differently. So the JIT is stuck with primitive values (pointers, integers). Hopefully we can add a nice C++ interface that hides the low-level access.

Regarding the vector structure, we use our own Vector structure, which is defined in https://github.com/Samsung/walrus/blob/interp/src/util/Vector.h. Vector simply consists of three members: the buffer address, the size of the vector, and its current capacity, as shown below. With this Vector structure, would it be possible to handle values in JITC? Or do you have a preferred array structure for JITC? Let me know if you do, and I'll update walrus based on the new array design.

template <typename T>
struct Vector {
  T* m_buffer;        // address of the element buffer
  size_t m_size;      // number of elements in use
  size_t m_capacity;  // number of elements allocated
};

In pr#74, I removed all shared pointers and changed the Store structure to manage all Objects that need to be deallocated. To be specific, Store holds the addresses of all modules, instances and extern objects and deallocates them at the end of walrus execution (in Store's destructor). This approach is not memory efficient, since all structures are kept alive for the whole runtime regardless of whether they are still in use. But IMO it is a much simpler way to manage the life cycle of Objects, and it also lets JITC access data addresses easily. Moreover, modules and instances were already kept in Store before pr#74, so most created Objects were already deallocated only at the end of walrus execution.

zherczeg commented 1 year ago

With this Vector structure, would it be possible to handle values in JITC?

I have checked it, and offsetof(Vector<void>, m_buffer) works as long as m_buffer is public.
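As an illustration of that check, the sketch below computes the offset as a constant and reads an element through raw byte arithmetic only, which is roughly what generated code would do. The `Vector` here is a simplified stand-in with public members, and `loadElementU32` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified stand-in for the walrus Vector, with public members.
template <typename T>
struct Vector {
    T* m_buffer;
    size_t m_size;
    size_t m_capacity;
};

// A constant the JIT compiler could embed into generated code.
constexpr size_t kBufferOffset = offsetof(Vector<void>, m_buffer);

// The element type must not change where the buffer pointer is stored.
static_assert(offsetof(Vector<uint32_t>, m_buffer) == kBufferOffset,
              "buffer offset must not depend on the element type");

// Read element `index` from a Vector<uint32_t> using only its address,
// the constant offset, and primitive loads.
inline uint32_t loadElementU32(const void* vectorAddress, size_t index) {
    const char* base = static_cast<const char*>(vectorAddress);
    const char* buffer;
    std::memcpy(&buffer, base + kBufferOffset, sizeof(buffer));
    uint32_t value;
    std::memcpy(&value, buffer + index * sizeof(uint32_t), sizeof(value));
    return value;
}
```

No bounds check is performed here; generated code would additionally compare the index against `m_size`, which can be located with `offsetof` in the same way.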

The majority of elements (such as globals) should be in the pointer list after the instance, but some elements, such as indirect function references, need access to vectors.

#74 is a nice patch! Basic reference counting should not be a problem in the JIT as long as the reference counter never reaches zero, so the objects never need to be freed by the JIT.

zherczeg commented 1 year ago

I have reworked the core patch; it uses pointers now. So the instance is followed by the list of pointers which represent the various WebAssembly data.

clover2123 commented 1 year ago

@zherczeg Sorry for the late review. It now seems that this patch replaces vector members with raw pointers. Our Vector is a simple structure, but is it still difficult to access from JIT code?

zherczeg commented 1 year ago

I have changed all types.