sciencesakura / mutf-8

An encoder/decoder for Modified UTF-8, which is used in the Java platform, for example in the class file format and object serialization.
MIT License

AssemblyScript WASM Build #23

Open Offroaders123 opened 6 months ago

Offroaders123 commented 6 months ago

I wonder whether building a WASM binary from the same TypeScript code, by way of AssemblyScript, would provide any noticeable performance gains? I haven't had any speed problems with the library as it currently stands, but it might help when encoding and decoding very large buffers, or a large number of them. It would be more of an enhancement for the greater scheme of things, if it made any noticeable difference.

I've been pondering this on my own, and I was curious if you may have already considered this before. It might not have any notable performance benefits over the plain JS implementation here, but I'm curious enough to look into it.

As a concept, a project like this seems to be one of the few examples where something like AssemblyScript could actually work great.
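For context, the core of a Modified UTF-8 encoder is a tight loop over UTF-16 code units that writes bytes into a typed array, which is the kind of code AssemblyScript is designed to compile. The following is a minimal sketch under that assumption, not this library's actual code; `encodeMutf8` is a hypothetical name used only for illustration. It follows the two rules that distinguish Modified UTF-8 from standard UTF-8: U+0000 is written as the two bytes `C0 80`, and each surrogate is encoded on its own as three bytes.

```typescript
// Hypothetical helper for illustration; not part of the mutf-8 package API.
function encodeMutf8(input: string): Uint8Array {
  // Worst case: every UTF-16 code unit becomes 3 bytes.
  const out = new Uint8Array(input.length * 3);
  let p = 0;
  for (let i = 0; i < input.length; i++) {
    const c = input.charCodeAt(i);
    if (c >= 0x0001 && c <= 0x007f) {
      // Plain ASCII (except NUL): one byte.
      out[p++] = c;
    } else if (c <= 0x07ff) {
      // Two bytes; this branch also covers U+0000, which becomes C0 80.
      out[p++] = 0xc0 | (c >> 6);
      out[p++] = 0x80 | (c & 0x3f);
    } else {
      // Three bytes; surrogates are encoded individually, so a
      // supplementary character takes 6 bytes in total.
      out[p++] = 0xe0 | (c >> 12);
      out[p++] = 0x80 | ((c >> 6) & 0x3f);
      out[p++] = 0x80 | (c & 0x3f);
    }
  }
  return out.subarray(0, p);
}
```

The loop uses only integer arithmetic and a preallocated buffer, so it would likely port to AssemblyScript with few changes, though JS engines also tend to optimize this pattern well.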

I think another reason why I thought of this is that I remember hearing at some point that the built-in Text Encoding APIs are backed by native code, and I wondered if moving to a JS-based approach would have any noticeable speed differences compared to the native one.

Thanks again for making this project! I think I'm looking to use it for my Minecraft NBT parsing library, NBTify.

https://github.com/Offroaders123/NBTify/issues/42

Offroaders123 commented 6 months ago

I tried this out with my NBT parser, and in my tests this MUTF-8 implementation was just a hair behind the Text Encoding APIs themselves, though results fluctuated and in a few cases this one was actually faster. That was with the plain JS implementation here; I haven't set up a WASM build yet. The difference is only on the order of tenths or hundredths of a millisecond, so it's barely noticeable. Overall it did take longer though, so I'm going to see whether WASM changes anything. I kind of expect it to be slower than this plain setup, since it will have to do all of the initial setup to get the WASM space configured.

UTF-8

[image: utf-8]

MUTF-8

[image: mutf-8]
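A comparison like the one described in this comment can be timed with a small harness. The sketch below is not the benchmark that was actually run; `timeIt` and the sample string are hypothetical, and it assumes a runtime that provides the global `TextEncoder`, `TextDecoder`, and `performance` objects (modern browsers or a recent Node.js). To compare against a MUTF-8 implementation, pass its encode and decode calls as additional callbacks.

```typescript
// Hypothetical timing helper: returns the average milliseconds per call.
function timeIt(label: string, fn: () => void, iterations = 1000): number {
  // Warm-up runs so the JIT has a chance to optimize before measuring.
  for (let i = 0; i < 100; i++) fn();
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const perCall = (performance.now() - start) / iterations;
  console.log(`${label}: ${perCall.toFixed(4)} ms per call`);
  return perCall;
}

// Made-up sample data; real NBT strings are usually much shorter.
const sample = "Hello, NBT! ".repeat(1000);
const encoder = new TextEncoder();
const decoder = new TextDecoder();
const bytes = encoder.encode(sample);

timeIt("TextEncoder.encode", () => {
  encoder.encode(sample);
});
timeIt("TextDecoder.decode", () => {
  decoder.decode(bytes);
});
```

Averaging over many iterations matters here because a single call takes only fractions of a millisecond, which is close to the timer's resolution.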

sciencesakura commented 5 months ago

@Offroaders123

Thank you for your thoughtful suggestion and interest in this project!

I haven't yet considered the approach you present in detail, but your points about potential performance gains, especially when dealing with large messages, and the suitability of AssemblyScript for such a simple library make sense.

On the other hand, I believe that a simple library like this one can also benefit from the optimizations provided by the JS runtime. This could mean that the performance gains from switching to AssemblyScript might not be as significant as expected, but it's certainly worth investigating.

Thank you again. If you have any findings on this matter, I would love to hear them.