emmanuel-marty / lzsa

Byte-aligned, efficient lossless packer that is optimized for fast decompression on 8-bit micros

Reorganize 6502 decompress_faster depackers for smaller size and grea… #59

Closed: jbrandwood closed this 3 years ago

jbrandwood commented 3 years ago

Hi!

Here are updated versions of my 6502 "decompress_faster" depackers.

Both main loops have been reorganized to optimize the code flow and status flag usage.

decompress_faster_v1.asm is 1 byte shorter for the "small" version, and 13 bytes shorter for the "fast" version. Decompression speed is basically unchanged.

decompress_faster_v2.asm is 1 byte shorter for the "small" version, and 12 bytes shorter for the "fast" version. Decompression speed for the "fast" version is nearly 3% better.

FYI, I have removed the "!if (LZSA_NO_INLINE | LZSA_USE_FFFF)" branch optimizations from the code. After some testing, they add 10 bytes (nearly 4%) to the code, but they only provide about a 1% improvement in decompression speed. The new code is already faster than that, and it is now (just) shorter than a page.
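As a rough check of the size trade-off described above, here is a minimal sketch in Python. It assumes a 6502 page of 256 bytes and uses a hypothetical depacker size of 255 bytes ("just" under a page); the exact byte count is not stated here, so `base_bytes` is an illustrative assumption rather than a measured value.

```python
# Sketch only: base_bytes is a hypothetical size, not a measured one.
PAGE_SIZE = 256  # bytes in one 6502 memory page


def size_overhead_percent(extra_bytes, base_bytes):
    """Return the size increase as a percentage of the base code size."""
    return 100.0 * extra_bytes / base_bytes


base_bytes = PAGE_SIZE - 1  # assumed: depacker is just under one page
extra_bytes = 10            # cost of the removed branch optimizations

overhead = size_overhead_percent(extra_bytes, base_bytes)
fits_in_page = (base_bytes + extra_bytes) <= PAGE_SIZE

print(f"size overhead: {overhead:.1f}%")
print(f"still fits in one page with the extra bytes: {fits_in_page}")
```

Under these assumptions, the extra 10 bytes come to a little under 4% of the code and would push the depacker over a single page, in exchange for about a 1% speed gain.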

Best wishes,

John