Closed GoogleCodeExporter closed 8 years ago
I'll handle this one.
Original comment by yann.col...@gmail.com
on 30 Jan 2012 at 9:44
Good point. ARM performance requires an ARM testbed for proper evaluation.
It cost me a lot of time to build one, but I guess I've got an almost working
one today. I will look into it.
Original comment by yann.col...@gmail.com
on 30 Jan 2012 at 9:45
I'm using the BeagleBone running Ubuntu 11 as my testbed. Under US$100.
James
Original comment by caprifin...@gmail.com
on 30 Jan 2012 at 8:15
As a quick test, you may want to disable the extra precaution the code takes
for CPUs that require strictly aligned memory access (such as many ARM cores),
if *and only if* your ARM board does indeed support unaligned access.
I ran a test on an ARM Cortex A8, which apparently supports unaligned
access, and it increased speed by almost 50%.
To disable the precaution, it's enough to modify these lines:
#ifdef __GNUC__
//#define _PACKED __attribute__ ((packed))
#define _PACKED
#else
#define _PACKED
#endif
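To illustrate why emptying _PACKED changes the generated code, here is a minimal sketch. The names DEMO_PACKED, demo_u32_plain and demo_u32_packed are hypothetical and not taken from the LZ4 source. With the packed attribute, GCC treats the struct as having 1-byte alignment, so on strict-alignment targets it has to read the value with byte-sized accesses. Without the attribute, the struct keeps the natural alignment of its member and a single word access can be emitted, which is only safe if the CPU tolerates unaligned addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the LZ4 definitions. */
#if defined(__GNUC__)
#  define DEMO_PACKED __attribute__((packed))
#else
#  define DEMO_PACKED /* assumption: no packing support, both structs behave alike */
#endif

/* Natural alignment: the compiler may use one 32-bit load to read v. */
typedef struct { uint32_t v; } demo_u32_plain;

/* Packed: alignment drops to 1, so the compiler must assume any address. */
typedef struct { uint32_t v; } DEMO_PACKED demo_u32_packed;

/* Packing a single-member struct changes its alignment, not its size. */
static_assert(sizeof(demo_u32_plain) == sizeof(demo_u32_packed),
              "packing must not change the size of a single-member struct");

/* Alignment the compiler assumes for each variant. */
static size_t demo_align_plain(void)  { return _Alignof(demo_u32_plain); }
static size_t demo_align_packed(void) { return _Alignof(demo_u32_packed); }
```

Printing the two alignment values on the target board is a cheap way to confirm which variant a given build actually uses.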
Regards
Original comment by yann.col...@gmail.com
on 1 Feb 2012 at 10:37
A proposed release candidate has been sent to your email.
It adds 2 features which may be of interest for your use case :
1) ARM and Unaligned Memory Access :
By default, LZ4 is very cautious with ARM processors, and entirely avoids the
“unaligned memory” problem.
However, some newer ARM CPUs are now able to handle unaligned memory access
properly.
This makes a critical performance difference.
However, this capability is not automatically detected by today’s
compilers, or at least, I guess, not by most of them.
A very recent predefined macro, called __ARM_FEATURE_UNALIGNED, has been
contributed by ARM to GCC.
I’ve integrated it, but unfortunately, it is too recent to be properly
supported by the current crop of compilers; maybe the next generation will.
Therefore, the only way to benefit from this feature is to manually instruct
the code to use it.
This can be done more easily now, with the following lines:
// Unaligned memory access ?
// This feature is automatically enabled for "common" CPU, such as x86.
// For others CPU, you may want to force this option manually to improve
// performance if your target CPU supports unaligned memory access
#if (__ARM_FEATURE_UNALIGNED)
#define CPU_UNALIGNED_ACCESS 1
#endif
You can force the detection to “1”, and it will gladly use unaligned memory
access.
On the ARM Cortex A8 used for testing, it resulted in a 50% performance increase.
On processors which do not support unaligned memory access, it will crash.
2) Incompressible segments detection
LZ4 can skip over incompressible segments.
It is more cautious than LZO in doing so.
In LZO 1x_1 especially, the skipping is so aggressive that it can rush through
a perfectly compressible large file on the grounds that one short segment was
not compressible.
For very small packets, however, this weakness becomes a strength.
You can instruct LZ4 to skip incompressible segments faster, by lowering the
confirmation level.
It is only necessary to modify this figure :
// NONCOMPRESSIBLE_CONFIRMATION :
// Increasing this value will make the algorithm search more before declaring a
// segment "incompressible"
// This could improve compression a bit, but will be slower on incompressible
// data
// Decreasing this value will make the algorithm declare its current segment
// "incompressible" much faster
// This may decrease compression ratio dramatically, but will be faster on
// incompressible data
// The default value (6) is recommended
#define NONCOMPRESSIBLE_CONFIRMATION 6
Finding the “optimal” value is a matter of use-case and test samples.
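As a rough illustration of the trade-off, the sketch below models a skip heuristic of this kind. It is a simplified model under stated assumptions, not the LZ4 implementation: the function name demo_count_probes is hypothetical, every match attempt is assumed to fail (fully incompressible input), and the step between attempts is assumed to grow by one after every 2^confirmation consecutive failures.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: count how many match attempts ("probes") are spent
 * scanning `length` bytes of incompressible data for a given confirmation
 * level. Assumption: every probe fails, so the step never resets.
 */
static size_t demo_count_probes(size_t length, unsigned confirmation)
{
    size_t pos = 0;
    size_t probes = 0;
    size_t attempts;

    assert(confirmation < 16);               /* keep the shift well defined */
    attempts = (size_t)1 << confirmation;    /* so that the first step is 1 */

    while (pos < length) {
        /* Step is 1 for the first 2^confirmation misses, then 2, 3, ... */
        size_t step = attempts >> confirmation;
        attempts++;
        probes++;
        pos += step;
    }
    return probes;
}
```

In this model a lower confirmation level makes the step grow sooner, so fewer probes are spent on incompressible data, at the price of also skipping compressible bytes that happen to follow a short run of misses.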
Best Regards
Original comment by yann.col...@gmail.com
on 2 Feb 2012 at 9:22
r54 has been published. It integrates the capability to manually force
"unaligned memory access" on ARM processors which support it.
Original comment by yann.col...@gmail.com
on 7 Feb 2012 at 5:04
option added in r54
Original comment by yann.col...@gmail.com
on 8 Feb 2012 at 12:33
Original issue reported on code.google.com by
caprifin...@gmail.com
on 30 Jan 2012 at 9:24