RoyalIcing / Orb

Write WebAssembly with Elixir
https://useorb.dev
BSD 3-Clause "New" or "Revised" License

Move away from nul-terminated strings? #7

Closed RoyalIcing closed 4 days ago

RoyalIcing commented 8 months ago

Orb conveniently converts strings like "abc" into an integer “pointer”, and automatically adds a (data) entry initialized in memory at that pointer offset.

These strings are modelled after C’s strings. They are nul-terminated, which means you must measure their length at runtime by looping over every character until you hit \0.

It also means a string cannot contain the zero byte, which some protocols need to include.
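
To make both costs concrete, here is a minimal sketch in Python rather than Orb. It models linear memory as a flat byte buffer; the names memory, write_c_string and c_strlen are hypothetical and exist only for this illustration.

```python
# Hypothetical model of WebAssembly linear memory as a flat byte buffer.
memory = bytearray(64)

def write_c_string(offset: int, data: bytes) -> int:
    """Copy data into memory at offset, followed by a nul terminator."""
    memory[offset:offset + len(data)] = data
    memory[offset + len(data)] = 0
    return offset  # the "pointer" is just the offset

def c_strlen(ptr: int) -> int:
    """Measure length by scanning byte by byte until a zero byte: O(n)."""
    n = 0
    while memory[ptr + n] != 0:
        n += 1
    return n

abc = write_c_string(0, b"abc")
framed = write_c_string(16, b"ab\x00cd")  # payload with an embedded zero byte

print(c_strlen(abc))     # 3
print(c_strlen(framed))  # 2: the scan stops at the embedded zero
```

The second call shows the encoding problem: five bytes were written, but any reader that relies on the terminator only ever sees the first two.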

I think the general consensus (citations needed) is that people consider C’s nul-terminators a mistake. Better to have an explicit length stored somehow. This is faster as you don’t have to iterate over the string to know its length, and likely safer as you can’t change a string’s length by merely flipping a byte to/from 0x0. It also lets you extract many “slices” from the string by advancing the start offset and shortening the length.
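
As a rough illustration of the slicing point, the sketch below (Python, not Orb) treats a string as an offset plus a length. StrRef, slice_ref and read are hypothetical names made up for this example.

```python
from typing import NamedTuple

# Hypothetical linear memory holding the bytes of "hello world" at offset 8.
memory = bytearray(32)
memory[8:19] = b"hello world"

class StrRef(NamedTuple):
    """Hypothetical (offset, length) pair; not an Orb type."""
    offset: int
    length: int

def slice_ref(s: StrRef, start: int, length: int) -> StrRef:
    """Take a sub-slice without copying or scanning any bytes."""
    if start < 0 or length < 0 or start + length > s.length:
        raise ValueError("slice out of range")
    return StrRef(s.offset + start, length)

def read(s: StrRef) -> bytes:
    """Copy out the bytes the reference points at."""
    return bytes(memory[s.offset:s.offset + s.length])

whole = StrRef(8, 11)
world = slice_ref(whole, 6, 5)
print(whole.length)  # 11, known without iterating
print(read(world))   # b'world'
```

Slicing only does arithmetic on two integers, so the underlying bytes are shared and never modified, which a nul-terminated representation cannot offer for anything but a suffix.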

Proposed options:

  1. Two i32s: memory-offset and length. Downside is that passing it to a function now requires two arguments, which is much harder for our macros to handle. You’d have to pass around a string in two parts, always remembering to keep the two variables together, and requiring some naming convention like _str and _len suffixes.
  2. Single i64, with the first 32 bits as the memory-offset and the second 32 bits as the length. This is much easier to pass to functions as it’s a single value. This could be represented by a type like Memory.Range. The downside is you’d need some lightweight inlined macros to extract the offset and length. And it might be a “weird” Orb-only convention. I’d prefer a solution that is elegant and obvious.
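
For option 2, the extraction helpers could look roughly like the Python sketch below. It assumes the offset lives in the high 32 bits and the length in the low 32 bits; the issue does not fix which half is which, and pack_range, range_offset and range_length are hypothetical names.

```python
MASK32 = 0xFFFF_FFFF

def pack_range(offset: int, length: int) -> int:
    """Pack two unsigned 32-bit values into one 64-bit value.

    Assumed layout: offset in the high 32 bits, length in the low 32 bits.
    """
    if not (0 <= offset <= MASK32 and 0 <= length <= MASK32):
        raise ValueError("offset and length must each fit in 32 bits")
    return (offset << 32) | length

def range_offset(packed: int) -> int:
    """Extract the high 32 bits: one shift and one mask."""
    return (packed >> 32) & MASK32

def range_length(packed: int) -> int:
    """Extract the low 32 bits: one mask."""
    return packed & MASK32

r = pack_range(1024, 3)
print(hex(r))           # 0x40000000003
print(range_offset(r))  # 1024
print(range_length(r))  # 3
```

Each accessor is a shift and a mask, so inlining them is cheap; the cost is that every caller has to know the packing convention.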

Further things to consider

RoyalIcing commented 5 months ago

This will also mean removing the Orb.I32.String module and its functions, which is great, as I’d prefer to keep the amount of included boilerplate code to a minimum.

RoyalIcing commented 1 month ago

Orb.I32.String has been removed.

RoyalIcing commented 4 days ago

Orb.Str has been added, which is a Custom Type for (i32 i32). The first i32 is the string’s memory address, and the second is its length in bytes.
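
A small Python model of what an address-plus-byte-length pair implies for callers, not Orb.Str itself: store and load are hypothetical helpers, and UTF-8 is an assumed encoding chosen only for the example.

```python
# Hypothetical model of linear memory; not Orb's implementation.
memory = bytearray(64)

def store(address: int, text: str) -> tuple[int, int]:
    """Write text as UTF-8 (assumed encoding); return (address, byte length)."""
    data = text.encode("utf-8")
    memory[address:address + len(data)] = data
    return (address, len(data))

def load(address: int, length: int) -> str:
    """Read exactly `length` bytes starting at `address`; no terminator needed."""
    return bytes(memory[address:address + length]).decode("utf-8")

address, length = store(4, "héllo\x00!")
print(length)                      # 8 bytes for 7 characters
print(len(load(address, length)))  # 7
```

Because the length counts bytes, a multi-byte character such as é takes two of them, and an embedded zero byte round-trips like any other byte.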