agalakhov / captdriver

Driver for Canon CAPT printers
GNU General Public License v3.0

Is Hi-SCoA an LZ77 State Machine, and meaning of POSx in LONGREPx? #60

Open mounaiban opened 4 weeks ago

mounaiban commented 4 weeks ago

Hi again, I have been studying the Hi-SCoA codec, and I am starting to understand it somewhat, but some things are still flying over my head. I've got two questions for now:

Q1

Just double-checking: when you say LONGREPx commands "copy prefix+N bytes from output position POSx", do you mean "go POSx bytes backwards from end of output, then copy prefix+N bytes, from there, to after end of output"?

Q2

From what I can figure, Hi-SCoA seems like an LZ77 implementation that uses a stack and registers. Instead of using dictionary keys in the form of backwards offsets and lengths, Hi-SCoA seems to be using commands for lookups (PREFIX, LONGREPx, REPBYTE), while keeping offsets in the registers (L2, L3, L4, L5).

It also seems to use the stack for single bytes, saving the classical LZ77 look-back for longer strings.

Would it be wrong to call Hi-SCoA "LZ77 implemented as a (Finite) State Machine"?

agalakhov commented 4 weeks ago

Hi,

IIRC, Hi-SCoA uses absolute offsets, that is, offsets relative to the beginning of the buffer being decompressed. But I may be wrong; it was many years ago. In fact, I have a working Hi-SCoA decompressor that may be used as a reference: https://github.com/agalakhov/anticapt/blob/master/hiscoa-decompress.c
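To make the two readings of POSx concrete, here is a minimal sketch of both in C. It is not taken from hiscoa-decompress.c: the function names `copy_relative` and `copy_absolute` are hypothetical, and which reading Hi-SCoA actually uses should be checked against the reference decompressor.

```c
#include <assert.h>
#include <stddef.h>

/* Illustration only; these helpers are hypothetical and do not come from
 * hiscoa-decompress.c. `out` is the output buffer and `out_len` is the
 * number of bytes decoded so far. The caller must make sure the buffer
 * has room for `count` more bytes. Both functions return the new length. */

/* Reading A (classic LZ77): pos counts backwards from the end of output. */
static size_t copy_relative(unsigned char *out, size_t out_len,
                            size_t pos, size_t count)
{
    assert(pos >= 1 && pos <= out_len);
    /* Copy byte by byte so an overlapping copy (count > pos) repeats data. */
    for (size_t i = 0; i < count; ++i)
        out[out_len + i] = out[out_len - pos + i];
    return out_len + count;
}

/* Reading B: pos is an absolute index from the start of the buffer. */
static size_t copy_absolute(unsigned char *out, size_t out_len,
                            size_t pos, size_t count)
{
    assert(pos < out_len);
    /* Byte by byte again, so the source may run into freshly written data. */
    for (size_t i = 0; i < count; ++i)
        out[out_len + i] = out[pos + i];
    return out_len + count;
}
```

With the output so far being "abcd", pos = 3 and count = 2, reading A appends "bc" while reading B appends "dd", so the two interpretations are easy to tell apart on real data.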

Yes, it is some kind of LZ77-family algorithm. For the very first version of the compressor I didn't even reverse-engineer it completely. I just figured out how to use it in RLE-like mode and found that the printer is pretty happy with it.