madler / zlib

A massively spiffy yet delicately unobtrusive compression library.
http://zlib.net/

ARM specific optimizations #216

Open Adenilson opened 7 years ago

Adenilson commented 7 years ago

libpng has both intrinsics and hand-written ASM code for ARM (on the pre-filters).

Would zlib be open to contributions of a few core/hot functions targeting ARM?

One good candidate we identified is Adler-32: a SIMD version is about 3x faster on ARMv8.
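
For context, here is a minimal scalar sketch of Adler-32 as defined in RFC 1950. It is not the NEON patch and not zlib's own code; the function name `adler32_scalar_sketch` is made up for illustration. It just shows the two running sums that a SIMD version would compute over many bytes at once.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u /* largest prime below 2^16 */

/* Hypothetical helper name; zlib's real entry point is adler32(). */
uint32_t adler32_scalar_sketch(const unsigned char *buf, size_t len)
{
    uint32_t a = 1; /* running sum of the bytes, starts at 1 */
    uint32_t b = 0; /* running sum of the successive values of a */

    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    /* b goes in the high 16 bits, a in the low 16 bits. */
    return (b << 16) | a;
}
```

Reducing modulo 65521 on every byte keeps the sketch simple but slow. Faster versions typically defer the modulo and process blocks of bytes per iteration, which is where vector instructions can help.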

Adenilson commented 7 years ago

I recently implemented a NEON-ized version of the Adler-32 checksum (https://codereview.chromium.org/2676493007/, about 3x faster on ARMv8), and I'm looking forward to upstreaming these patches instead of further forking the zlib used in Chromium.

Adenilson commented 7 years ago

Please see pull request at: https://github.com/madler/zlib/pull/251

Adenilson commented 7 years ago

Next candidate would be CRC32 (it can be made 7x to 10x faster by using the CRC32 instruction available in ARMv8).
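
A rough sketch of the idea, not the actual Chromium or zlib code: the ARMv8 CRC32 instructions use the same polynomial as zlib's CRC-32, so they can replace the inner loop when the compiler reports `__ARM_FEATURE_CRC32` (the ACLE feature macro). The function name `crc32_sketch` is hypothetical, and the fallback is a plain bitwise loop rather than zlib's table-driven code.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h> /* ACLE intrinsics: __crc32b, __crc32w, __crc32d, ... */
#endif

/* Hypothetical helper name; zlib's real entry point is crc32(). */
uint32_t crc32_sketch(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc; /* CRC-32 works on the inverted value internally */
#if defined(__ARM_FEATURE_CRC32)
    /* One byte per instruction, for clarity only. */
    for (size_t i = 0; i < len; i++)
        crc = __crc32b(crc, buf[i]);
#else
    /* Portable bitwise fallback, reflected polynomial 0xEDB88320. */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
#endif
    return ~crc;
}
```

Calling `crc32_sketch(0, data, len)` starts a new checksum, and passing a previous result as the first argument continues it. An optimized version would presumably consume 8 bytes at a time with `__crc32d` and check for the CPU feature at run time. Both are left out here.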

https://bugs.chromium.org/p/chromium/issues/detail?id=709716

AraHaan commented 7 years ago

If it can be made faster, just do it. Anyone could easily benefit from the performance increase.

Adenilson commented 7 years ago

zlib is both efficient and fast (not to mention insanely portable) and has served the world well for the last 2 decades. It is used everywhere: Linux kernel, Chromium, Firefox, libpng, iOS, Android, etc.

We should all be grateful that it was made available for free by its authors.

Adenilson commented 7 years ago

One way to improve performance is to sacrifice portability (e.g. CPU-specific code). That is a considerable cost, so it is better to keep such code contained in well-separated functions/modules.

Adenilson commented 7 years ago

@timofonic zlib-ng has accepted the ARM-specific optimizations; IIRC they are in a development branch.