emmanuel-marty / lzsa

Byte-aligned, efficient lossless packer that is optimized for fast decompression on 8-bit micros

Rewrite 8088 jumptable decompressor for maximum speed #35

Closed MobyGamer closed 5 years ago

MobyGamer commented 5 years ago

This is a rewrite of LZSA1JMP.ASM to use a 256-entry jumptable, which lets the code handle all of the hot paths (the common cases) without any branching. This not only cuts branches, which are very costly on the 8088, to a bare minimum, but also tells each decode path up front which steps it can skip.
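For readers unfamiliar with the technique, here is a minimal C sketch of table-driven dispatch on a token byte. It is not the code in this PR and it does not decode LZSA1: the toy format (high nibble = number of literal bytes that follow, low nibble = number of times to repeat the last output byte, token `0x00` = end of stream) and every name in it (`toy_decode`, `toy_table`, the `h_*` handlers) are made up for illustration. Bounds checks are omitted, and the indirect calls and loop in C say nothing about 8088 timing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy format, NOT LZSA1:
   token bits 4-7 = count of literal bytes that follow the token,
   token bits 0-3 = how many times to repeat the last output byte,
   token 0x00     = end of stream. */
typedef struct {
    const uint8_t *src;  /* next input byte */
    uint8_t *dst;        /* next output byte */
    uint8_t last;        /* last byte written */
    int done;            /* set by the end-of-stream handler */
} toy_state;

typedef void (*toy_handler)(toy_state *s, uint8_t token);

static void copy_literals(toy_state *s, size_t n) {
    memcpy(s->dst, s->src, n);  /* callers guarantee n > 0 */
    s->src += n;
    s->dst += n;
    s->last = s->dst[-1];
}

static void repeat_last(toy_state *s, size_t n) {
    memset(s->dst, s->last, n);
    s->dst += n;
}

/* Each handler already knows which fields are zero, so it never tests them. */
static void h_end(toy_state *s, uint8_t t)  { (void)t; s->done = 1; }
static void h_lit(toy_state *s, uint8_t t)  { copy_literals(s, t >> 4); }
static void h_rep(toy_state *s, uint8_t t)  { repeat_last(s, t & 0x0F); }
static void h_both(toy_state *s, uint8_t t) {
    copy_literals(s, t >> 4);
    repeat_last(s, t & 0x0F);
}

static toy_handler toy_table[256];

/* Classify all 256 token values once, ahead of time. */
static void toy_init(void) {
    for (int t = 0; t < 256; t++) {
        int lit = t >> 4, rep = t & 0x0F;
        toy_table[t] = (lit && rep) ? h_both
                     : lit          ? h_lit
                     : rep          ? h_rep
                     :                h_end;
    }
}

/* Decodes src into dst and returns the number of bytes written. */
size_t toy_decode(const uint8_t *src, uint8_t *dst) {
    toy_state s = { src, dst, 0, 0 };
    toy_init();
    while (!s.done) {
        uint8_t t = *s.src++;
        toy_table[t](&s, t);  /* one indexed jump replaces the field tests */
    }
    return (size_t)(s.dst - dst);
}
```

In this sketch, decoding the bytes `0x32 'a' 'b' 'c' 0x03 0x00` dispatches to `h_both`, then `h_rep`, then `h_end`. The second token has no literals, so its handler contains no literal-copy code at all. That is the sense in which indexing by the token tells a decode path which steps it can skip.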

The new code is 12.7% faster than the old code, and assembles to less than 3K of object code and data.