powturbo / Turbo-Base64

Turbo Base64 - Fastest Base64 SIMD:SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!
GNU General Public License v3.0

MB/s for decoder is misleading #11

Closed: ilyakurdyukov closed this issue 2 years ago

ilyakurdyukov commented 2 years ago

The benchmark reports the decoder's MB/s in terms of the input data rather than the output.

And you don't clarify that anywhere, even though you do clarify that Kb means 1000.

Are you also calculating MB/s for decompressors in terms of input size? So if you compress gigabytes of zeros, then the performance of the decompressor will be around zero (because the input will be extremely small).

If we imagine a base64 modification in which encoding and decoding take the same amount of time, then your benchmark will show that decoding is 1.333 times faster.
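To make the 1.333 figure concrete, here is a minimal sketch of that thought experiment. The payload size and the elapsed time are hypothetical values chosen for illustration, not measurements; the only fixed fact is that base64 emits 4 characters for every 3 raw bytes.

```python
def throughput_mb_s(num_bytes: int, seconds: float) -> float:
    """Throughput in MB/s, with 1 MB = 1,000,000 bytes."""
    return num_bytes / 1e6 / seconds

RAW_BYTES = 3_000_000                # hypothetical raw payload size
ENCODED_BYTES = RAW_BYTES * 4 // 3   # base64: 4 output chars per 3 raw bytes
ELAPSED = 0.001                      # hypothetical: encode and decode both take 1 ms

# Both speeds are computed from the *input* size, as the benchmark does.
encode_speed = throughput_mb_s(RAW_BYTES, ELAPSED)       # encoder input is raw data
decode_speed = throughput_mb_s(ENCODED_BYTES, ELAPSED)   # decoder input is encoded data

ratio = decode_speed / encode_speed
print(f"decoder appears {ratio:.3f}x faster")  # same wall time, different byte count
```

Measured against the raw size instead, both directions would report the same speed, which matches the premise that they take the same time.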

So I think it's wrong.

powturbo commented 2 years ago

You're right, but this method is also used by other base64 libraries. Originally, I had considered the raw size (input or output) to calculate the encoding/decoding speed (this is the usual way). I changed this method to make Turbo-Base64 comparable to other libraries, for example https://github.com/aklomp/base64. The decoder's input cannot be extremely small even if the data is gigabytes of zeros; it is always 133.3% of the raw data size.
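The size relationship both commenters are arguing about can be checked with Python's standard library. This is only an illustrative sketch: the 3 MB buffer of zeros is an arbitrary choice, and `zlib` stands in for "a decompressor" in general. It shows that base64 output is a fixed 4/3 of the raw size, while a compressor's output depends on the content.

```python
import base64
import zlib

raw = bytes(3_000_000)            # 3 MB of zeros (arbitrary illustrative payload)

encoded = base64.b64encode(raw)   # base64: size depends only on the input length
compressed = zlib.compress(raw)   # compression: size depends on the content

b64_ratio = len(encoded) / len(raw)
zlib_ratio = len(compressed) / len(raw)

print(f"base64 size / raw size: {b64_ratio:.4f}")
print(f"zlib size / raw size:   {zlib_ratio:.6f}")
```

So an input-based MB/s figure is off by a constant factor for a base64 decoder, but by an unbounded, data-dependent factor for a decompressor.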

ilyakurdyukov commented 2 years ago

In the provided example (https://github.com/aklomp/base64), the results aren't compared to memcpy. That's what makes the big difference.

I'm angry that you are claiming that your library's decoding speed is close to memcpy. But memcpy copies raw data.

But if you feed the encoded data to the base64 decoder, you only get 75% of it in return.

So when you say things like "Fastest AVX2 implementation, damn near to memcpy" and show results like this:

26032 TB64avx2
29992 memcpy

Then many people might think that if they are using base64 instead of binary data and use your library, then the downside is only:

29992 / 26032 = 1.152 times slower

But this is a mistake, because the decoder has to process more data than memcpy to deliver the same raw bytes. So the real downside to using base64 would be:

29992 / (26032 * 0.75) = 1.536 times slower
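The two calculations above can be reproduced with a short sketch. The function name and its parameters are hypothetical helpers introduced here for illustration; the two numbers are the benchmark figures quoted in this comment, and 0.75 is the fraction of decoder input that comes back as raw output.

```python
def slowdown_vs_memcpy(memcpy_speed: float,
                       decoder_input_speed: float,
                       output_per_input: float = 0.75) -> tuple[float, float]:
    """Return (naive, adjusted) slowdown factors relative to memcpy.

    naive:    compares memcpy with the decoder's input-based speed as reported.
    adjusted: first converts the decoder's speed to raw output bytes per second.
    """
    naive = memcpy_speed / decoder_input_speed
    adjusted = memcpy_speed / (decoder_input_speed * output_per_input)
    return naive, adjusted

# Figures quoted above: 29992 for memcpy, 26032 for TB64avx2 decoding.
naive, adjusted = slowdown_vs_memcpy(29992, 26032)
print(f"naive:    {naive:.3f} times slower")     # 1.152
print(f"adjusted: {adjusted:.3f} times slower")  # 1.536
```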

That's a huge difference, and it's not "damn near to memcpy". And the way the results are displayed in your benchmark leads to this mistake.

So to prevent this mistake, you should measure speed relative to the raw data, or at least add a note or warning about this to the readme. That would be fair.

powturbo commented 2 years ago

The comparison to memcpy and all the benchmark results are provided only for orientation. Base64 differs from memcpy in that the input and output buffers are not the same length. This is common to all compression/encoding benchmarks. Nobody cares; what matters most is the comparison to other base64 libraries, and here Turbo-Base64 is the fastest base64 library.

ilyakurdyukov commented 2 years ago

Fine, I'll write my own benchmark. If the majority does something wrong, it doesn't mean that I should follow the majority.