AssemblyScript / assemblyscript

A TypeScript-like language for WebAssembly.
https://www.assemblyscript.org
Apache License 2.0

Implement `unknown` for holding native pointers. #2859

Closed sasq64 closed 1 month ago

sasq64 commented 3 months ago

Feature suggestion

Implement a type that can hold a u64 or u32, that cannot be created from AssemblyScript but only received from native functions, and that cannot be manipulated in any way (similar to TypeScript's 'unknown').

This would allow more safety when passing native pointers back and forth to AS.

MaxGraey commented 3 months ago

'unknown' is not a pointer. It's a top type. If you need a runtime variant for AS you can use as-variant

sasq64 commented 3 months ago

The type can be called something else. The important part is that it is opaque to AssemblyScript. The compiler should not assume anything about it, and should only allow it to be passed along further.

HerrCai0907 commented 3 months ago

What is this type at the Wasm level? A type is not only for diagnostics; it has to map to a lower-level memory layout.

Maybe you can describe the high-level requirement, for example what kind of native functions you have in mind. Then we can discuss this more generically. From your current description, it looks like externref.

CountBleck commented 3 months ago

Yes, I agree with @HerrCai0907 in that the type that best fits your description is externref. However, externref can't be a class field, because storing values requires them to have an in-memory representation, which externref, being opaque, lacks. If you could store them in memory, it would defeat the whole point (since you could modify it there and load it back).

sasq64 commented 3 months ago

Well I would probably need to store it.

Thinking more about it I guess the clearest would maybe be a type opaque_ptr, that is stored as a u64 or u32 (configurable depending on host) and where the AS side is not allowed to write values into it.

MaxGraey commented 3 months ago

I think you're making up some very far-fetched problem. No one has ever had this problem before. Everyone has used usize or u32 for pointers for FFI because WebAssembly itself can address only within a 32-bit space (practically nobody supports multi-memory with a 64-bit linear space now). Other languages also don't use anything like this for addressing. I think this is because you have not fully understood what linear memory is in WebAssembly and that it cannot be directly mapped from host memory space and vice versa. In any case, you will have to do some work to normalize the host address space into WebAssembly's linear-indexed space, regardless of the width of the host address word itself.

sasq64 commented 3 months ago

I think I just wasn't clear on my use case.

I don't care about pointers/memory in web assembly space at all. I create objects on the native side and just need the wasm/AS side to hold them.

Like, in C I have defined and exported

Button* create_button();
void set_button_color(Button*, uint32_t col);

and from AS I call this like

let btn = env.create_button();
env.set_button_color(btn, 0xff00ff);

The way I handle that now is to generate prototypes like

export declare function create_button(): object

and compile with --use=object=u64 or --use=object=u32 depending on my host platform.

(I repurpose object so that TypeScript-aware editors don't show hundreds of errors for an unknown type.)

MaxGraey commented 3 months ago

declare type opaque = number?

But again, it doesn't matter what bitness the host platform has, in any case WebAssembly can only address 32-bit linear space at the moment

sasq64 commented 3 months ago

The size of pointers on the host matters only because I need to store them in a correctly sized type on the WASM side, and make sure they are passed correctly from the host (which is a bit tricky with WAMR).

And the size of the WASM memory space does not matter at all.

HerrCai0907 commented 3 months ago

You should use u64 to store the host pointer if you want compatibility with both 64-bit and 32-bit host machines. Otherwise the wasm module is not reusable across different hosts.

sasq64 commented 3 months ago

The problem is that a native function void hey(void* ptr) will need a different signature depending on the host. I would have to wrap all functions that take pointers, like this:

void hey2(uint64_t data) {
  /* Parenthesize the mask: == binds tighter than & in C. */
  assert((data & 0xffffffff00000000ULL) == 0);
  hey((void*)(uintptr_t)data);
}

And the default target is a 32-bit microcontroller so I don't like the added indirection and complexity just to support the desktop (debug) version...

But ideally, yes, I would like to avoid a "dynamic" type like this.
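
The wrapper above hard-codes a 32-bit host. As a rough sketch only, the same check can be written against UINTPTR_MAX so that one source file builds for both the 32-bit target and a 64-bit desktop host. The helper names ptr_to_u64, u64_fits_ptr and u64_to_ptr are hypothetical and are not part of AssemblyScript or WAMR; the sketch also assumes the host provides uintptr_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of AssemblyScript or WAMR. */

/* Widen a host pointer to the fixed 64-bit value that crosses the Wasm boundary. */
static uint64_t ptr_to_u64(const void *ptr) {
  return (uint64_t)(uintptr_t)ptr;
}

/* True if `value` is representable as a pointer-sized integer on this host. */
static bool u64_fits_ptr(uint64_t value) {
  return value <= (uint64_t)UINTPTR_MAX;
}

/* Narrow the 64-bit value back to a host pointer; NULL if it is out of range. */
static void *u64_to_ptr(uint64_t value) {
  if (!u64_fits_ptr(value)) {
    return NULL;
  }
  return (void *)(uintptr_t)value;
}
```

On a 64-bit host the range check is always true and a compiler will typically fold it away; on a 32-bit host it plays the role of the assert in hey2. It only catches values that are too wide, not values that were altered but still fit.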

HerrCai0907 commented 3 months ago

I recommend using externref for this case.

sasq64 commented 3 months ago

As mentioned above I will probably eventually want to store it, so I don't want to use that.

dcodeIO commented 3 months ago

Just in case it's useful: If necessary, could also emulate what externref + a table would natively provide by maintaining a custom index => ptr mapping on the host side and passing only the (32-bit) index through Wasm. By passing around the index, the Wasm can't somehow corrupt a pointer value.
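
A minimal host-side sketch of the index => ptr mapping described above, assuming a small fixed-size table. The names handle_register, handle_lookup, handle_release and MAX_HANDLES are hypothetical and chosen only for illustration; handle 0 is reserved here to mean "invalid".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HANDLES 64u /* arbitrary capacity for this sketch */

/* Slot i holds the pointer for handle i + 1; handle 0 means "invalid". */
static void *handle_table[MAX_HANDLES];

/* Store ptr and return its 32-bit handle, or 0 if ptr is NULL or the table is full. */
static uint32_t handle_register(void *ptr) {
  if (ptr == NULL) {
    return 0;
  }
  for (uint32_t i = 0; i < MAX_HANDLES; i++) {
    if (handle_table[i] == NULL) {
      handle_table[i] = ptr;
      return i + 1;
    }
  }
  return 0;
}

/* Resolve a handle received from Wasm; NULL for out-of-range or released handles. */
static void *handle_lookup(uint32_t handle) {
  if (handle == 0 || handle > MAX_HANDLES) {
    return NULL;
  }
  return handle_table[handle - 1];
}

/* Clear the slot so that it can be reused by a later handle_register call. */
static void handle_release(uint32_t handle) {
  if (handle != 0 && handle <= MAX_HANDLES) {
    handle_table[handle - 1] = NULL;
  }
}
```

With something like this, create_button would return handle_register(button) and set_button_color would call handle_lookup before touching the object, so the Wasm side only ever sees a u32 regardless of host pointer width. Because slots are reused, a stale handle can end up referring to a newer object; a per-slot generation counter is one common way to detect that, at the cost of a little more bookkeeping.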

CountBleck commented 3 months ago

While that might generally be a good idea on the web, that probably has a performance hit. That's okay if manipulating native pointers is a security concern though...

@sasq64 are you running untrusted code on the microcontroller?

sasq64 commented 3 months ago

No, it will be our own code, and will be signed as well - so this is mostly to guard against programming errors. So we can live with the current object=u64/u32 workaround.

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in one week if no further activity occurs. Thank you for your contributions!