Closed deneonet closed 3 months ago
I think it's fine to remove reused buffers, but where applicable the serialisers should still be using preallocated buffers rather than letting the slices grow. The rationale is that if the encoder supports using pre-allocated buffers, the benchmarks should take advantage of them.
I mean it's just easier to not allow buffer reusing, than writing multiple benchmarks just for buffer-reusing. There were also complains about it. But letting the slice grow on the baseline serializer, just kills the whole purpose of it, true. Let me fix it.
Reusing already allocated and correctly sized buffers is one of the strategies to achieve high marshal performance on most serializers, and if a serializer does not support that, it necessarily can't achieve the best possible performance on several workloads (for example: a tight loop that serializes sequential values).
If a serializer doesn't support that, it should be obvious, either through the benchmark results or through an annotation of some sort.
There were also complains about it
What complaints?
Here a pull request, which changed, that all serializer allocate memory (so no buffer-reuse) and here an issue, that was opened to remove the buffer-reuse from my own serializer. It should be made clear, whether buffer reusing is allowed or not.
Also, for example, bebop, doesn't support buffer reusing natively, it was added using a 'buf' field in the serializer struct. Now my question: Why didn't you add the buffer field on every serializer that supports it, like MUS or benc?
Bebop does support buffer reuse natively, that's why it has a MarshalBebopTo
method and it's docs suggest it's use in this way.
Doesn't MarshalBebopTo
require a buffer as argument?
Quoting from the original issue:
Basically, I don't have an issue with any particular optimisations as long as it's very clear what they are and what the tradeoffs are.
Measuring with buffer reuse for the serializers that support is important to me, but I can see how that may different for users.
I see 2 possible solutions here:
/bufferReuse
alternate benchmark for the serializers that do support it (same as the unsafe variants).I would be for the first solution, to also be able to compare benchmarks without buffer reuse.
I would be for the first solution, to also be able to compare benchmarks without buffer reuse.
Agreed, and this echoes my original statement. I do think that currently it is not obvious which serialisers are either using preallocated buffers, or reusing buffers. We should be consistent about that too.
I would be for the first solution, to also be able to compare benchmarks without buffer reuse.
I also think this is a best solution. But now, in the Totals table for each serializer, we have two records: marshal and unmarshal. Is this correct? If so, for a serializer that supports unsafe and reuse, we will have 8 records:
Isn't it too much? We could have only 4:
I also think this is a best solution. But now, in the Totals table for each serializer, we have two records: marshal and unmarshal.
See the new HTML report card: https://htmlpreview.github.io/?https://github.com/alecthomas/go_serialization_benchmarks/blob/master/report/index.html
It aggregates marshal/unmarshal into a single line, so there is (e.g.) BENC and BENC/unsafe. With reuse, there would be either one or two additional lines (BENC, BENC/reuse, BENC/unsafe, BENC/unsafe/reuse).
The table doesn't show up for me, just the metrics text.
Which browser are you using? Is JS enabled?
Arc, I think that Js is enabled, otherwise any other site wouldn't work.
On chrome is also doesn't work, on Floorp though. On Vivaldi it doesn't work too, I think it doesn't work on all chromium-based browsers.
Some new form of blocking on chrome... :/
Will see if I can address it later.
Some serializer structs had buffer, which were reused, so I removed them + I updated benc to the latest version.