google / flatbuffers

FlatBuffers: Memory Efficient Serialization Library
https://flatbuffers.dev/
Apache License 2.0

flatbuffers for web assembly #4332

Closed 64BitAsura closed 7 years ago

64BitAsura commented 7 years ago

How difficult would it be to add WebAssembly support?

aardappel commented 7 years ago

WebAssembly is an intermediate format target for languages like C/C++/Rust etc, so you'd typically use FlatBuffers in those languages directly. While theoretically you could generate WebAssembly code directly, there would be no easy way to refer to it, and you'd miss out on the high level typing those languages give you.

KageKirin commented 7 years ago

I think the question was meant to be "are flatbuffers compatible with web assembly, i.e. does the lib compile with Emscripten?" Well, yes, that's the case. The C++ code can compile to WASM/ASM.js via Emscripten, as it's just classical C++11 source, so it's entirely compatible with Emscripten's Clang compiler.

psy0rz commented 3 years ago

i wonder if the performance would be better than the native flatbuffers JS implementation?

aardappel commented 3 years ago

@psy0rz that may be difficult, since most FlatBuffers accessors are very little code (just a few memory reads), so the advantage of them being in C++/Wasm may be dwarfed by the fact that they need to cross the Wasm<->JS boundary (which is still expensive in V8 currently) and may need a JS wrapper for conversion to JS values.
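
To make "just a few memory reads" concrete, here is a rough Python sketch of what a scalar accessor boils down to. The 20-byte buffer is hand-assembled for illustration, and `read_i32_field` is a hypothetical helper, not part of any FlatBuffers API; generated code performs the same handful of reads with the field slot baked in.

```python
import struct

# Hand-assembled little-endian buffer: one table with a single int32 field.
BUF = b"".join([
    struct.pack("<I", 12),         # byte 0: root offset, table starts at byte 12
    struct.pack("<HHH", 6, 8, 4),  # byte 4: vtable (vtable size 6, table size 8, field 0 at +4)
    b"\x00\x00",                   # byte 10: padding to 4-byte alignment
    struct.pack("<i", 8),          # byte 12: table start, signed offset back to vtable (12 - 8 = 4)
    struct.pack("<i", 42),         # byte 16: the field value
])

def read_i32_field(buf, field_id, default=0):
    """Hypothetical accessor: four small reads, no parsing or copying."""
    table = struct.unpack_from("<I", buf, 0)[0]               # read 1: root offset
    vtable = table - struct.unpack_from("<i", buf, table)[0]  # read 2: locate the vtable
    vtable_size = struct.unpack_from("<H", buf, vtable)[0]
    slot = 4 + 2 * field_id                                   # skip the two u16 header entries
    if slot >= vtable_size:
        return default                                        # field unknown to the writer
    field_off = struct.unpack_from("<H", buf, vtable + slot)[0]  # read 3: field offset
    if field_off == 0:
        return default                                        # field not stored
    return struct.unpack_from("<i", buf, table + field_off)[0]   # read 4: the value

value = read_i32_field(BUF, 0)
```

With so little work per call, any fixed per-call cost for crossing a language boundary can easily outweigh the accessor itself.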

It would certainly be a nice way to bring more expensive functionality to JS, i.e. the C++ schema parser, JSON parser and JSON output could all be provided via Wasm.

oberstet commented 1 year ago

fwiw, I've thought about WASM target for Flatbuffers, and IMO, the "problem" to consider is: WASM supports exactly 4 scalar types i32, i64, f32, f64 (plus v128, but that's a vector already)

https://webassembly.github.io/spec/core/syntax/types.html#number-types https://webassembly.github.io/spec/core/binary/types.html#number-types

everything else, including bools, strings and bytes, is shoved into the above 4. and a host language targeting WASM will already include code for the appropriate accessors?

aardappel commented 1 year ago

@oberstet that's the problem, Wasm isn't a programming language, it's more like a (virtual) CPU. We don't have an x86 target for FlatBuffers either, because it would run into the same problem: what string representation do you want, exactly? A string would be an i32 (pointer) on wasm32, but how things are laid out and managed in memory is a question only a specific programming language implementation can answer.
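
As a small illustration of the point, the sketch below uses a Python `bytearray` as a stand-in for a Wasm linear memory and stores the same text under two conventions: a C-style NUL-terminated string and a length-prefixed one (the approach FlatBuffers takes for its own strings). The memory size, addresses, and function names are made up for the example. In both cases the "string" handed around is just an integer address, and only the reader's convention gives it meaning.

```python
import struct

memory = bytearray(64)  # stand-in for a Wasm linear memory (size is arbitrary)

# Convention A (C-like): the pointer refers to bytes terminated by NUL.
c_ptr = 8
memory[c_ptr:c_ptr + 3] = b"hi\x00"

# Convention B (length-prefixed): a u32 length, then the bytes.
fb_ptr = 32
struct.pack_into("<I", memory, fb_ptr, 2)
memory[fb_ptr + 4:fb_ptr + 6] = b"hi"

def read_c_string(mem, ptr):
    """Interpret ptr as a NUL-terminated UTF-8 string."""
    end = mem.index(0, ptr)
    return bytes(mem[ptr:end]).decode("utf-8")

def read_prefixed_string(mem, ptr):
    """Interpret ptr as a u32 length followed by UTF-8 bytes."""
    (length,) = struct.unpack_from("<I", mem, ptr)
    return bytes(mem[ptr + 4:ptr + 4 + length]).decode("utf-8")

# Same i32 "pointer", different convention: the result is garbage, not "hi".
mismatched = read_c_string(memory, fb_ptr)
```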

Even the upcoming Wasm GC proposal, which introduces higher-level types like arrays and structs outside of a single memory, will still have this problem, as any GC language may use a different combination and layout of these types for what it considers a "string" or any other type, including placement of metadata/vtables etc.

oberstet commented 1 year ago

> that's the problem, Wasm isn't a programming language, it's more like a (virtual) CPU. ... how things are laid out and managed in memory is a question only a specific programming language implementation can answer

@aardappel yeah, true, both.

I stumbled across this issue since I've recently looked into how to communicate efficiently between different guest languages running on a WASM host.

On the one hand, I think Flatbuffers could still be a powerful tool, e.g. when sending flat WASM memory areas over the context boundary (host/guest), and then using generated source language bindings via Flatbuffers. However, my worry is: this generated code isn't optimized to be shoved into the rigid 32/64-bit word model of WASM .. will see.

The other alternative I am looking at is the "WebAssembly Component Model"

fwiw, this stuff looks quite good, and the people involved have many decades of experience (as in: "so WIT is the new CORBA/DCOM? No! The new COM - any remoting/networking is not WIT's business" - couldn't agree more ;)

aardappel commented 1 year ago

@oberstet I argued for (something like) FlatBuffers to be the interaction model when we first talked about "Interface Types" in Wasm (now absorbed in the "Component Model" I guess?), or at least that choosing to use FlatBuffers for this should be really easy, with a focus on good support for binary buffers.

But even if that's the model, you don't need a Wasm implementation for FlatBuffers, you just need the languages on both sides of the fence to have their own implementation of FlatBuffers. You just need to agree on what a buffer is and how it's managed.

oberstet commented 1 year ago

> I argued for (something like) FlatBuffers to be the interaction model when we first talked about "Interface Types" in Wasm ...

ok, I see! wasn't aware of your involvement, and I only became aware of WIT just now, but my first reaction was the same: why not build on Flatbuffers IDL? I think the Flatbuffers IDL is quite good, concise and extensible via attributes.

Eg I've used it to specify "service interfaces" which include both "procedures" and "topics", and can be "provided" and "used" (the latter 2 are called "import" and "export" in WIT)

https://github.com/crossbario/crossbar-examples/tree/master/payload-validation#type-catalogs

Also, I think the built-in scalar types are more practical than WASM's - which has a weird choice of fixed size scalars, 32b/64b only, but including both integers and IEEE754.

My problem with the latter (in the context in which I am currently revisiting this stuff) is that it is non-deterministic (IEEE754 leaves "implementation defined" behavior like in C).

And that doesn't work for blockchains and verifiable computation in general.

The Ethereum VM uses no floats at all and only 1 fixed size scalar, 256b. Obviously, not native to CPUs. The weirdest non-standard thing: newish VMs which use F_p with p prime to do arithmetic ;) Anyways.

> But even if that's the model, you don't need a Wasm implementation for FlatBuffers ....

Sure, using flatc at build-time would be enough.

But I would also want to have run-time support like (which could be done of course)

and a support library for

However, the last one points to a problem: one needs a source lang => WASM translator (eg Rust compiler) at run-time for dynamic upgrading of the generated binding when the type library changes.

Mmmh, wait ... Flatbuffers itself obviously allows schema evolution .. if I update a FBS type library, my existing client with the binding generated for the old schema still can access the new data.
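
A rough sketch of why that works: each table carries a small vtable describing which fields are present, so a reader looks fields up by slot and falls back to a default when a slot is missing. The two buffers below are hand-assembled for illustration and `get_i32` is a hypothetical helper, not generated FlatBuffers code.

```python
import struct

def get_i32(buf, field_id, default=0):
    """Read int32 field `field_id` of the root table, or `default` if absent."""
    table = struct.unpack_from("<I", buf, 0)[0]
    vtable = table - struct.unpack_from("<i", buf, table)[0]
    vtable_size = struct.unpack_from("<H", buf, vtable)[0]
    slot = 4 + 2 * field_id
    if slot >= vtable_size:
        return default  # the writer's schema did not know this field
    rel = struct.unpack_from("<H", buf, vtable + slot)[0]
    return default if rel == 0 else struct.unpack_from("<i", buf, table + rel)[0]

# Written with schema v1: one field (id 0) holding 7.
# Layout: root offset, vtable (size 6, table size 8, field 0 at +4), 2 pad bytes, table.
OLD_BUF = struct.pack("<IHHHxxii", 12, 6, 8, 4, 8, 7)

# Written with schema v2: a second field (id 1) holding 99 was appended.
# Layout: root offset, vtable (size 8, table size 12, fields at +4 and +8), table.
NEW_BUF = struct.pack("<IHHHHiii", 12, 8, 12, 4, 8, 8, 7, 99)

old_reader_on_new_data = get_i32(NEW_BUF, 0)              # v1 binding, v2 data
new_reader_on_old_data = get_i32(OLD_BUF, 1, default=-1)  # v2 binding, v1 data
```

The binding generated for the old schema never looks at slot 1, so the extra field is simply ignored, while a newer binding reading older data gets the default.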

So yes, I guess I agree: Flatbuffers IDL seems to check it all=) Thanks for your hints, opinions and the conversation!

oberstet commented 1 year ago

ahh, WIT (the Component Model's Canonical ABI) canonicalizes float NaNs to

CANONICAL_FLOAT32_NAN = 0x7fc00000
CANONICAL_FLOAT64_NAN = 0x7ff8000000000000

https://github.com/WebAssembly/component-model/blob/main/design/mvp/CanonicalABI.md#loading
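
A minimal sketch of what such canonicalization amounts to at the bit level, assuming the two constants quoted above. The function names are hypothetical and this works on raw bit patterns; it is not the code from the Canonical ABI document.

```python
import struct

CANONICAL_FLOAT32_NAN = 0x7FC00000
CANONICAL_FLOAT64_NAN = 0x7FF8000000000000

def canonicalize_f32_bits(bits):
    """Map every float32 NaN bit pattern to the single canonical one."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    is_nan = exponent == 0xFF and mantissa != 0
    return CANONICAL_FLOAT32_NAN if is_nan else bits

def canonicalize_f64_bits(bits):
    """Map every float64 NaN bit pattern to the single canonical one."""
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & 0xFFFFFFFFFFFFF
    is_nan = exponent == 0x7FF and mantissa != 0
    return CANONICAL_FLOAT64_NAN if is_nan else bits

def f32_bits(x):
    """Bit pattern of a Python float stored as float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

# A NaN with a payload and a negative NaN both collapse to the same pattern,
# which removes one source of cross-implementation differences.
payload_nan = canonicalize_f32_bits(0x7FC00001)
negative_nan = canonicalize_f32_bits(0xFFC00000)
```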