inkeliz / karmem

Karmem is a fast binary serialization format, faster than Google Flatbuffers and optimized for TinyGo and WASM.
BSD 3-Clause "New" or "Revised" License

Better Speed Comparison #31

Open pascaldekloe opened 2 years ago

pascaldekloe commented 2 years ago

The Go implementation of Flatbuffers is notoriously slow. It would be good to see an actual comparison with other formats. For Go we have https://github.com/alecthomas/go_serialization_benchmarks.

inkeliz commented 2 years ago

I would like to add more benchmarks, including comparisons with other languages (such as C#, Swift and C, which also support Flatbuffers). Right now, I'm testing Flatbuffers on C# to compare against it.


I think there are some issues with the mentioned benchmark. Of course, no benchmark is perfect, and the current one isn't either. However, if you take a look at the source code, you will notice something like:

func Benchmark_Ikea_Marshal(b *testing.B) {
    buf := new(bytes.Buffer)
    buf.Grow(100)

Scrolling more, you have:

     buf := &bytes.Buffer{}

Or, even:

      bytes, err := s.Marshal(o)

So, the first issue: is buf re-use allowed? That saves a lot of allocs. Also, why do some start with 100 bytes and some don't? Is it fair to compare one that can re-use the space against another that can't (for whatever reason)?


Secondly, if "re-use" is allowed, why isn't it allowed on Unmarshal? Looking into the code, most of the implementations are:

    n := rand.Intn(len(ser))
    o := &GogoProtoBufA{}
    err := proto.Unmarshal(ser[n], o)

In that case, o always needs to be recreated. That puts more pressure on the GC and creates more allocations, because any pointer field (such as slices) needs to be re-created. Well, that might not be an issue here, because the sample is just primitive values (the A struct doesn't have any Field []Something). Depending on your use case, you might add a sync.Pool, and that allocation will be mitigated.


Currently, I'm focusing on porting Karmem to more languages (such as C# and Rust, and maybe Haskell and Kotlin). After that, I'll focus on improving the current tests, adding features and so on. I really want to create an "IDL generator", which will create random schemas, data and code to validate them. That will allow anyone to compare the performance (in more scenarios) and can also be used for fuzzing. Maybe, for TinyGo, I can pick other serializers from https://github.com/alecthomas/go_serialization_benchmarks and add them to the comparison, running on WebAssembly.

pascaldekloe commented 2 years ago

The benchmark should use each API as intended. If there are multiple options, then you should be allowed to pick the fastest option. You are absolutely right about the unfair comparison.

I don't think the object (o := &GogoProtoBufA{}) goes to garbage collection if you create it in a loop without retaining the memory.
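Whether such an object reaches the heap depends on the compiler's escape analysis, so one way to check is to measure it. The sketch below uses a hypothetical `point` type and compares a pointer that stays local with one that is stored in a package-level variable. It says nothing definitive about proto.Unmarshal itself, where the object is passed into a library call and the result may differ.

```go
package main

import (
	"fmt"
	"testing"
)

// point is a hypothetical small struct used only for this measurement.
type point struct{ X, Y int }

// sink is a package-level variable; storing a pointer here forces it to escape.
var sink *point

// allocsLocal creates a pointer that never leaves the closure.
func allocsLocal() float64 {
	sum := 0
	return testing.AllocsPerRun(1000, func() {
		p := &point{X: 1, Y: 2}
		sum += p.X + p.Y
	})
}

// allocsEscaping creates a pointer and retains it in a global.
func allocsEscaping() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = &point{X: 1, Y: 2}
	})
}

func main() {
	fmt.Printf("local pointer:    %.0f allocs/op\n", allocsLocal())
	fmt.Printf("escaping pointer: %.0f allocs/op\n", allocsEscaping())
}
```

Building with `go build -gcflags=-m` prints the compiler's escape decisions, which is a quick way to see what happens for a specific Unmarshal call.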

pascaldekloe commented 2 years ago

OK, I will make all marshalers allocate the buffer. Can you submit a pull request with this format @inkeliz?

pascaldekloe commented 2 years ago

Done and committed to main @inkeliz.