mitsuhiko / deser

Experimental rust serialization library
https://docs.rs/deser
Apache License 2.0
289 stars, 8 forks

Improve Performance #34

Open mitsuhiko opened 2 years ago

mitsuhiko commented 2 years ago

The JSON serializer/deserializer currently demonstrates that the performance of the entire system is pretty abysmal. Running the same benchmark as with serde/miniserde yields significantly worse results:

running 6 tests
test bench_deserialize_deser_json ... bench:   1,777,264 ns/iter (+/- 93,338)
test bench_deserialize_miniserde  ... bench:     776,057 ns/iter (+/- 49,367)
test bench_deserialize_serdejson  ... bench:     674,177 ns/iter (+/- 2,023)
test bench_serialize_deser_json   ... bench:   1,471,595 ns/iter (+/- 53,628)
test bench_serialize_miniserde    ... bench:     482,137 ns/iter (+/- 46,288)
test bench_serialize_serdejson    ... bench:     317,567 ns/iter (+/- 23,359)
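
To put the figures above in relative terms, here is a small sketch that divides the deser_json timings by the miniserde and serde_json timings. The `slowdown` helper is hypothetical and exists only for this illustration; the ns/iter values are copied from the benchmark output.

```rust
/// Ratio of two ns/iter measurements (hypothetical helper, not part of deser).
fn slowdown(deser_ns: u64, baseline_ns: u64) -> f64 {
    deser_ns as f64 / baseline_ns as f64
}

fn main() {
    // (label, deser_json ns/iter, baseline ns/iter), copied from the run above.
    let rows = [
        ("deserialize vs miniserde", 1_777_264, 776_057),
        ("deserialize vs serde_json", 1_777_264, 674_177),
        ("serialize vs miniserde", 1_471_595, 482_137),
        ("serialize vs serde_json", 1_471_595, 317_567),
    ];
    for (label, deser_ns, baseline_ns) in rows {
        println!("{label}: {:.2}x", slowdown(deser_ns, baseline_ns));
    }
}
```

By this measure deser_json is roughly 2.3x to 2.6x slower when deserializing and roughly 3.1x to 4.6x slower when serializing.
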
mitsuhiko commented 2 years ago

Unsurprisingly, a lot of the perf impact comes from allocations and deallocations. Compared to miniserde, a big source of overhead appears to be Option<T>, which in the case of deser requires an allocation at all times, whereas miniserde gets away with an unsafe cast of its visitor because it knows that it can always borrow the original visitor. We don't have that luxury.

mitsuhiko commented 2 years ago

The performance is indeed mostly trash because of excessive allocations. Annoyingly, it's unclear to me how this could be avoided entirely with this design. The only real option I see is to reuse allocations somehow, but that might significantly complicate the implementation. The most frustrating case right now is for sure the fact that Option<T> needs to allocate, which is doubly annoying if the T itself is a sink that allocates. For instance, Option<Vec<T>> will allocate at the moment for no good reason, even if it is not used at all.
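
To make the Option<T> cost concrete, here is a minimal sketch of the difference between boxing the inner sink up front and boxing it only once a non-null value arrives. Everything in it (`Sink`, `VecSink`, `LazyOptionSink`, the `allocs` counter) is a hypothetical, heavily simplified stand-in and not deser's actual API. Deferring the box is also just one conceivable mitigation, not something proposed in this thread.

```rust
use std::cell::Cell;

/// Hypothetical, simplified sink trait (not deser's real trait).
trait Sink {
    fn number(&mut self, value: u64);
}

/// Stand-in for a sink that collects values into a Vec.
struct VecSink {
    items: Vec<u64>,
}

impl Sink for VecSink {
    fn number(&mut self, value: u64) {
        self.items.push(value);
    }
}

/// Creates the type-erased inner sink and counts each Box that is made.
fn boxed_vec_sink(allocs: &Cell<usize>) -> Box<dyn Sink> {
    allocs.set(allocs.get() + 1);
    Box::new(VecSink { items: Vec::new() })
}

/// Option-like sink that only boxes the inner sink on first real use.
struct LazyOptionSink<'a> {
    allocs: &'a Cell<usize>,
    inner: Option<Box<dyn Sink>>,
}

impl<'a> LazyOptionSink<'a> {
    fn new(allocs: &'a Cell<usize>) -> Self {
        Self { allocs, inner: None }
    }

    /// A null value never needs the inner sink.
    fn null(&mut self) {
        self.inner = None;
    }

    /// A non-null value creates the inner sink on demand, then forwards to it.
    fn number(&mut self, value: u64) {
        let allocs = self.allocs;
        self.inner
            .get_or_insert_with(|| boxed_vec_sink(allocs))
            .number(value);
    }
}

fn main() {
    let allocs = Cell::new(0);

    let mut absent = LazyOptionSink::new(&allocs);
    absent.null();
    println!("boxes after null: {}", allocs.get());

    let mut present = LazyOptionSink::new(&allocs);
    present.number(1);
    present.number(2);
    println!("boxes after two values: {}", allocs.get());
}
```

An eager variant would call `boxed_vec_sink` inside `new`, paying for one box per Option even when the value turns out to be null. That is the case described above.
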

mitsuhiko commented 2 years ago

I'm not so sure the design of the library can be optimized much without sacrificing the dynamic dispatch. The only potential option I see is to use a custom allocator on the states. There are a lot of temporary allocations, and I see some potential room for improvement there. However, this also requires passing the state to functions that currently do not have it. For instance, deserialize_into currently does not get the state, yet it creates all the boxed sink handles.
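
As a rough illustration of reusing temporary allocations through the state, here is a sketch of a state object that hands out scratch buffers and takes them back. It is a simple free-list pool rather than a real custom allocator. The `State` type, its methods, and `handle_value` are all hypothetical names and do not reflect deser's actual state type or function signatures.

```rust
/// Hypothetical state object carrying a pool of reusable scratch buffers.
#[derive(Default)]
struct State {
    free: Vec<Vec<u8>>,
    fresh_allocations: usize,
}

impl State {
    /// Hands out a recycled buffer if one is available, else allocates one.
    fn take_buffer(&mut self) -> Vec<u8> {
        match self.free.pop() {
            Some(buf) => buf,
            None => {
                self.fresh_allocations += 1;
                Vec::with_capacity(64)
            }
        }
    }

    /// Returns a buffer to the pool; clearing keeps its capacity for reuse.
    fn give_back(&mut self, mut buf: Vec<u8>) {
        buf.clear();
        self.free.push(buf);
    }
}

/// Stand-in for a function that needs temporary storage. It has to receive
/// the state explicitly, which is the API change mentioned above.
fn handle_value(state: &mut State, raw: &str) -> usize {
    let mut buf = state.take_buffer();
    buf.extend_from_slice(raw.as_bytes());
    let len = buf.len();
    state.give_back(buf);
    len
}

fn main() {
    let mut state = State::default();
    for raw in ["alpha", "beta", "gamma"] {
        handle_value(&mut state, raw);
    }
    println!("fresh allocations: {}", state.fresh_allocations);
}
```

The pool only helps functions that can reach the state, so a function like deserialize_into would need a state parameter before it could take part in such a scheme.
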