scottredig / zig-javascript-bridge

Easily call Javascript from Zig wasm
MIT License
35 stars 2 forks

Support reading string from Javascript into Zig. #2

Open scottredig opened 3 months ago

scottredig commented 3 months ago

Needs to support two major use cases:

This is similar to how printing works. Having a reader may also be a good call. Error handling for allocator errors or slice being too small is a requirement.

I'm considering two different approaches:

  1. Change the return type argument in the various methods into a union. For the already existing types, the payload would be void. Then add values for strings which take an allocator or a slice. The return type would map to (roughly) ![]u8. There are two considerations. First, for the slice usage, knowing how many utf16 code units were consumed and how many bytes of utf8 were written to the slice is occasionally important info, and it makes usage more awkward here, since an error would need to be returned in addition to these values. Second, I haven't dug into the details of how to actually do the string writing, especially for the allocator version, which requires some back and forth between Zig and JS. This approach would also require a cast method being added to Handle, which would operate like Handle.get, but on the Handle itself instead of a field. This is because sometimes a Zig function might be called with a Javascript string object as one of the arguments.
  2. The other approach is to add specific methods which copy from a handle to a string into Zig. This would have a lot less impact on the rest of the library, but would require the creation and release of Handles whenever accessing a string field. Bleh.

Finally, the implementation should consider JS's TextEncoder methods encode vs encodeInto. encodeInto obviously works better for the use case of sending to a slice. However, the better method for allocated strings is less clear. Using encode creates a whole copy of the string which is immediately discarded. On the other hand, encodeInto requires the length to already be known, which requires a JS function that goes over the string counting the utf8 length. Alternatively, it's safe to just allocate 3 times the utf16 length, which is always enough but also often very wasteful.
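To make the trade-off concrete, here is a rough sketch of the three allocation strategies in plain JavaScript. The function names (viaEncode, utf8Length, viaExactEncodeInto, viaWorstCaseEncodeInto) are made up for illustration and are not part of the library; only TextEncoder.encode and TextEncoder.encodeInto are real APIs.

```javascript
const encoder = new TextEncoder();

// Option A: encode() allocates a fresh Uint8Array, which would then be
// copied into wasm memory and discarded.
function viaEncode(str) {
  return encoder.encode(str);
}

// Helper for option B: walk the string once and count UTF-8 bytes.
// Iterating with for...of goes by code point; a lone surrogate counts as
// 3 bytes, matching the U+FFFD replacement that TextEncoder emits.
function utf8Length(str) {
  let len = 0;
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) len += 1;
    else if (cp < 0x800) len += 2;
    else if (cp < 0x10000) len += 3;
    else len += 4;
  }
  return len;
}

// Option B: pre-count, then encodeInto an exactly sized buffer.
function viaExactEncodeInto(str) {
  const dest = new Uint8Array(utf8Length(str));
  const { read, written } = encoder.encodeInto(str, dest);
  return { dest, read, written };
}

// Option C: skip the counting pass and reserve 3 bytes per UTF-16 code unit,
// which is always enough but usually far more than needed.
function viaWorstCaseEncodeInto(str) {
  const dest = new Uint8Array(str.length * 3);
  const { written } = encoder.encodeInto(str, dest);
  return dest.subarray(0, written);
}
```

In a real bridge the destination buffers would be views over wasm memory that Zig allocated, so option B costs an extra pass over the string while option C costs extra bytes on the Zig side.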

Convenient methods to do similar actions with Uint8Array should also be considered, and possibly leveraged in the allocator case.

scottredig commented 3 months ago

Another option is to support only utf16 string copying, since that's what Javascript uses natively. Then have the Zig code be responsible for converting it into utf8. For the allocation case, this moves the extra allocation from JS to Zig, which is probably a performance improvement (assuming the alternative isn't iterating over the utf16 string beforehand to predetermine the length and then using encodeInto). For the slice case, a utf16 transfer is less error prone, since the consumed and written numbers are the same.
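As a sketch of what the JS half of a utf16-only copy could look like: the ArrayBuffer below stands in for wasm linear memory, and copyUtf16 is a hypothetical helper name, not an existing function in the library.

```javascript
// Stand-in for wasm linear memory; a real bridge would use the buffer of the
// module's exported memory instead.
const fakeMemory = new ArrayBuffer(64);

// Copies up to maxUnits UTF-16 code units of str to byte offset ptr.
// ptr must be 2-byte aligned, otherwise the Uint16Array constructor throws.
// Uint16Array uses the platform byte order, which matches wasm's
// little-endian layout on little-endian hosts.
function copyUtf16(str, ptr, maxUnits) {
  const dest = new Uint16Array(fakeMemory, ptr, maxUnits);
  const count = Math.min(str.length, maxUnits);
  for (let i = 0; i < count; i++) {
    dest[i] = str.charCodeAt(i);
  }
  // A single number covers both "consumed" and "written".
  return count;
}
```

Note that a truncated copy can end in the middle of a surrogate pair, so the Zig side would still need to handle a trailing unpaired surrogate when converting to utf8.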

scottredig commented 2 months ago

So I was going to go down a rabbit hole of wanting different methods to be performance tested, but instead I think it'd be better to do the following, which shouldn't be hitting any performance pitfalls. Further improvements to performance can be made later as they shouldn't affect the api anyways.

4 public facing options should be available:

Finally, going from a string into a preexisting slice uses TextEncoder's encodeInto on a Uint8Array view that covers only the slice, so that encodeInto can't touch memory outside of the slice's bounds.
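A minimal sketch of that bounded view, assuming the slice arrives as a pointer and length. The names linearMemory and encodeIntoSlice are placeholders for illustration; the ArrayBuffer stands in for the wasm memory buffer.

```javascript
// Stand-in for wasm linear memory.
const linearMemory = new ArrayBuffer(32);
const sliceEncoder = new TextEncoder();

// Encodes str as UTF-8 into the len bytes starting at byte offset ptr.
// The view is constructed over exactly [ptr, ptr + len), so encodeInto has
// no way to write past the end of the slice.
function encodeIntoSlice(str, ptr, len) {
  const view = new Uint8Array(linearMemory, ptr, len);
  // read = UTF-16 code units consumed, written = UTF-8 bytes stored.
  const { read, written } = sliceEncoder.encodeInto(str, view);
  return { read, written, truncated: read < str.length };
}
```

encodeInto never writes a partial character, so when the slice is too small, written can be less than len. The read and written counts are what the Zig side would need in order to report a "slice too small" error sensibly.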

Overall, I think these should be methods on the handle of the string (or Uint8Array) itself. So not messing around with the api related to getting fields or calling functions. I'm willing to be convinced otherwise on this, but I'm guessing the extra handle churn and management is worth not making the whole API much more complicated.

scottredig commented 2 months ago

@noxabellus take a stab at the above if you want to, otherwise I'll get to this when I'm done with live-webserver.