RoyalIcing / Orb

Write WebAssembly with Elixir
https://useorb.dev
BSD 3-Clause "New" or "Revised" License

Get string address and type #36

Open orsinium opened 2 weeks ago

orsinium commented 2 weeks ago

According to the docs, strings get transformed into Orb.Memory.Range (the docs say Slice instead of Range, but I think that's a typo). However, that's not what happens. When I define a function:

  def log_debug(s) do
    Firefly.Bindings.Misc.log_debug(
      Orb.Memory.Range.get_byte_offset(s),
      Orb.Memory.Range.get_byte_length(s)
    )
  end

And then call this function:

log_debug("hello")

The function receives a string, not a Range:

** (FunctionClauseError) no function clause matching in Orb.Memory.Range.get_byte_offset/1    

    The following arguments were given to Orb.Memory.Range.get_byte_offset/1:

        # 1
        "hello"

    Attempted function clauses (showing 2 out of 2):

        def get_byte_offset(-<<byte_offset::integer-little-unsigned-size(32), _::integer-little-unsigned-size(32)>>-)
        def get_byte_offset(-range = %{push_type: _}-)

    (orb 0.0.46) lib/orb/memory/range.ex:29: Orb.Memory.Range.get_byte_offset/1
    (firefly 0.1.0) lib/misc.ex:6: Firefly.Misc.log_debug/1
    (firefly 0.1.0) lib/demo/triangle.ex:5: Firefly.Demo.Triangle.__wasm_body__/1
    (orb 0.0.46) lib/orb/module_definition.ex:45: Orb.ModuleDefinition.get_body_of/1
    (orb 0.0.46) lib/orb/compiler.ex:18: Orb.Compiler.run/2
    (firefly 0.1.0) lib/demo/triangle.ex:1: Firefly.Demo.Triangle.__wasm_module__/0
    (orb 0.0.46) lib/orb.ex:1011: Orb.to_wat/1
    (firefly 0.1.0) lib/mix/tasks/wasm.ex:11: Mix.Tasks.Wasm.run/1

I guess it gets transformed only when passed into a host-defined function? The problem I'm trying to solve is that while the Range type packs both the string offset and the string length into a single i64 value, I need to pass two separate i32 values to the host, one for the offset and one for the length (as shown in the snippet above).
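For readers following along, here is a minimal sketch in plain Elixir of how an offset and a length can share one 64-bit value and be split back into two 32-bit values. It is not Orb's API: the module `PackedRangeSketch` and its function names are hypothetical, and the layout (offset in the low 32 bits, length in the high 32 bits) is an assumption inferred from the binary pattern in the `get_byte_offset/1` clause printed in the error above.

```elixir
defmodule PackedRangeSketch do
  # Hypothetical helper for illustration only; not part of Orb.
  import Bitwise

  @max_u32 0xFFFF_FFFF

  # Pack two unsigned 32-bit values into one 64-bit integer.
  # Assumed layout: offset in the low 32 bits, length in the high 32 bits.
  def pack(offset, len)
      when is_integer(offset) and offset >= 0 and offset <= @max_u32 and
             is_integer(len) and len >= 0 and len <= @max_u32 do
    (len <<< 32) ||| offset
  end

  # Recover the low 32 bits (the offset).
  def byte_offset(packed), do: packed &&& @max_u32

  # Recover the high 32 bits (the length).
  def byte_length(packed), do: (packed >>> 32) &&& @max_u32

  # The same value as 8 little-endian bytes: offset first, then length.
  def to_binary(packed), do: <<packed::integer-little-unsigned-size(64)>>
end
```

With a helper like this, a wrapper could unpack the value once and hand the two resulting integers to the host import as separate i32 arguments.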

orsinium commented 2 weeks ago

For reference, here is how the same function looks in Rust:

https://github.com/firefly-zero/firefly-rust/blob/main/src/misc.rs#L4-L10

RoyalIcing commented 1 week ago

Yes, apologies, this is one of the main decisions to be made for the alpha: how to model strings (#7). Currently in Orb a string is an i32 pointer to a nul-terminated string. But those have security issues, because they make it too easy to create buffer overflows or underflows.

So I’d like to always have the string length included whenever you reference a string. But WebAssembly doesn’t let you pass around tuples, so packing i32+i32 into an i64 is my best idea currently.

The Range and Slice naming is another decision to be made. Orb doesn’t have an ownership model: the memory is just there and you can create pointer references to parts of it. So I want that to be clear in the name, but I also want it to feel natural, similar to how you work with strings in other languages. Are you working with a pointer range or a slice of the memory? That is, is the thing you are working with the pointer or the memory itself? I’m leaning towards Slice, and that’s why it crept into the site before the code has actually been updated.

I’ll make progress on this soon and make a PR to the https://github.com/firefly-zero/firefly-elixir project. It looks really great and readable!

Feel free to share your opinions on the above, as I’m curious what you find works well with, say, Zig and Rust.

orsinium commented 1 week ago

Wasm supports tuples. You can return multiple values from a function (which is supported by all runtimes), and you can represent as multiple arguments in wasm what is a single argument in your code.

I'm writing a programming language that compiles into wasm. Well, not right now, I switched to Firefly for now, but before that I made good progress on it. Ask me questions if you get stuck.

For representing values in memory, passing them around, and things like that, I suggest following the ABI described in the component model.

RoyalIcing commented 1 week ago

Sure, would appreciate any knowledge you can bring!

Multiple return values are awesome but the limitation is you can’t store tuples in locals. Orb currently exposes the underlying primitives as-is, so to support more complex locals I’d either have to create a C-like stack abstraction, or wait for the component model.

Orb targets currently shipping WebAssembly runtimes, so it’s going to be pretty conservative even when the component model matures.

orsinium commented 1 week ago

You can still make more locals for tuples, and nobody will complain. If a variable a in the Elixir code stores a 2-element tuple, just make a.0 and a.1 locals in wasm.
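As a rough illustration of that suggestion, the sketch below expands tuple-typed variables into one flat local per element. It is plain Elixir and not how Orb works today: the module `TupleLocalsSketch`, its function names, and the input shape (a keyword list of variable names to types) are all hypothetical.

```elixir
defmodule TupleLocalsSketch do
  # Hypothetical helper for illustration only; not part of Orb.

  # Expand each source-level variable into one or more flat locals.
  # A scalar keeps its name; an n-element tuple `a` becomes `a.0` .. `a.(n-1)`.
  def flatten(vars) do
    Enum.flat_map(vars, fn
      {name, types} when is_tuple(types) ->
        types
        |> Tuple.to_list()
        |> Enum.with_index()
        |> Enum.map(fn {type, index} -> {"#{name}.#{index}", type} end)

      {name, type} ->
        [{to_string(name), type}]
    end)
  end

  # Render the flat locals as WAT-style local declarations.
  def to_wat(locals) do
    Enum.map_join(locals, " ", fn {name, type} -> "(local $#{name} #{type})" end)
  end
end
```

For example, `TupleLocalsSketch.flatten(a: {:i32, :i32}, n: :i64)` yields three locals named `a.0`, `a.1` and `n`, which `to_wat/1` renders as separate `(local ...)` declarations.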

orsinium commented 1 week ago

The component model describes many things. One of them is a standard ABI, and you should follow that ABI when storing values in linear memory. You don't need any special support from the runtime side to do that.