Entropy coding notes.
1.1. Length coding
Currently it is a static coding optimized for binaries, which is not optimal for text. Maybe I should implement a second static coding for text (almost but not exactly identical to the first) and some way to switch between them.
1.2. Offset coding.
Uses a custom coding that is similar to Exponential-Golomb coding (but actually more complicated). The current parameters in lzoma.h were hand-tuned on a few files. I should gather statistics on more files, then write some code to auto-tune the parameters for the average case.
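As a reference point for the auto-tuning idea, here is a minimal sketch that uses plain order-k Exponential-Golomb coding, not the actual lzoma offset code. It computes the cost in bits of a value and picks the order with the smallest total cost over a sample of offsets. The names expgolomb_bits and best_order are hypothetical and do not come from lzoma.h; max_k is assumed to be at most 31.

```c
#include <stddef.h>
#include <stdint.h>

/* Bits used by an order-k Exp-Golomb code for v:
 * 2 * floor(log2(v + 2^k)) - k + 1. Assumes k <= 31. */
static unsigned expgolomb_bits(uint32_t v, unsigned k)
{
    uint64_t x = (uint64_t)v + ((uint64_t)1 << k);
    unsigned n = 0;                 /* n becomes floor(log2(x)) */
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return 2 * n - k + 1;
}

/* Try every order in [0, max_k] and return the one with the smallest
 * total cost over the sample. Ties go to the smaller order. */
static unsigned best_order(const uint32_t *values, size_t count, unsigned max_k)
{
    unsigned best_k = 0;
    uint64_t best_cost = UINT64_MAX;
    for (unsigned k = 0; k <= max_k; k++) {
        uint64_t cost = 0;
        for (size_t i = 0; i < count; i++)
            cost += expgolomb_bits(values[i], k);
        if (cost < best_cost) {
            best_cost = cost;
            best_k = k;
        }
    }
    return best_k;
}
```

The same brute-force search should carry over to the real coding: replace expgolomb_bits with a function that returns the cost under a candidate parameter set and feed it offsets collected from many files.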
1.3. Flags, literals.
Currently not compressed at all.
Looking at zstd speed, it may be interesting to try a similar FSE coder for literals, flags, and some parts of offsets. An interesting possibility is to combine flags, literals, and some bits from lengths and offsets into a single 512-symbol alphabet, so that I only have to FSE_decode one symbol per item (literal or match), then read more bits directly if needed.
Testing shows that even with simple order-0 coders it is possible to beat brotli and get very close to xz.
Problem: the optimal parser needs to know exactly how many bits each item takes to encode. One possible solution is to do a fast first pass to estimate statistics, then a second pass with optimal parsing for the actual compression. It may then become possible to do 3rd and 4th passes to refine the statistics, like glza does, but unlike glza the result can at least be decompressed without lots of extra memory.
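One way the combined 512-symbol alphabet could be laid out is sketched below. The split is purely an assumption for illustration: symbols 0..255 are literals (the literal flag is implied by the range), and symbols 256..511 are matches carrying a 3-bit length slot and a 5-bit offset bit count. MIN_MATCH, the slot sizes, and all function names are hypothetical.

```c
#include <stdint.h>

enum { MATCH_BASE = 256, ALPHABET_SIZE = 512, MIN_MATCH = 2, LEN_ESCAPE = 7 };

/* Literal: symbols 0..255, so no separate flag bit is needed. */
static unsigned literal_symbol(uint8_t byte)
{
    return byte;
}

/* Number of significant bits in v (0 for v == 0). Assumes v < 2^31. */
static unsigned bit_count(uint32_t v)
{
    unsigned n = 0;
    while (v) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Match: symbols 256..511. Length slot 7 is an escape meaning that the
 * rest of the length follows as raw bits. Assumes length >= MIN_MATCH. */
static unsigned match_symbol(uint32_t length, uint32_t offset)
{
    uint32_t slot = length - MIN_MATCH;
    if (slot > LEN_ESCAPE)
        slot = LEN_ESCAPE;
    return MATCH_BASE + (slot << 5) + bit_count(offset);
}

/* Decoder side: is this symbol a match, and how many raw offset bits
 * follow it? The top bit of the offset is implied by the bit count. */
static int symbol_is_match(unsigned sym)
{
    return sym >= MATCH_BASE;
}

static unsigned extra_offset_bits(unsigned sym)
{
    unsigned bits = (sym - MATCH_BASE) & 31;
    return bits ? bits - 1 : 0;
}
```

With this layout the decoder does one table lookup per item and then reads extra_offset_bits(sym) raw bits for matches, plus the remaining length bits when the length slot is the escape value.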
Filter notes
2.1. Text compression.
lzoma currently works really well on the xwrt-transformed enwik8 (enwik8.xwrt), but decompression takes 0.3 s for lzoma to get back to enwik8.xwrt and then 1.5 s to undo the xwrt transform, so undoing the filter is 5x slower than the decompression itself. I should check which filters give the most improvement and can be implemented with minimal code and a fast backward transform. Maybe there is some way to use them on the text parts of binary files.
2.2. Binary compression.
The e8e9 filter is not the best one for x86. flt32 by Dmitry Shkarin is much better. Problem: I do not know yet how to do such a transform efficiently in place.
The delta transform from FreeArc is also useful, at least for the Linux kernel. Problem: the code for undoing it is neither fast enough nor small enough for executable compression.
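For comparison, the e8e9 filter mentioned above is easy to run in place. Below is a minimal sketch of one common variant, assuming every E8 (call) or E9 (jmp) byte is followed by a 32-bit little-endian relative operand that is converted to an absolute target on encode and back on decode. Real implementations often add range checks to limit false positives; this sketch has none, and the function names are hypothetical rather than taken from lzoma.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* In-place E8/E9 transform. encode != 0: relative -> absolute operand;
 * encode == 0: absolute -> relative. The opcode bytes are never changed
 * and the operand is skipped in both directions, so the encoder and the
 * decoder always visit the same positions. */
static void e8e9_filter(uint8_t *buf, size_t size, int encode)
{
    for (size_t i = 0; i + 5 <= size; i++) {
        if ((buf[i] & 0xFE) != 0xE8)    /* matches 0xE8 and 0xE9 only */
            continue;
        uint32_t v = load_le32(buf + i + 1);
        uint32_t next = (uint32_t)(i + 5);  /* address after the instruction */
        store_le32(buf + i + 1, encode ? v + next : v - next);
        i += 4;                             /* skip the operand bytes */
    }
}
```

The undo loop needs no extra memory and only a few instructions per byte, which is the property a replacement filter would have to match to stay useful for executable compression.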
Speed
gcc produces bloated code on all platforms I know of, especially in terms of code size. I should implement asm unpackers at least for ARM, x86, and x86-64.