image-js / tiff

TIFF image decoder written entirely in JavaScript
MIT License

TIFF LZW style - When to change the code length #36

Open lpatiny opened 3 years ago

lpatiny commented 3 years ago

Looking at the spec: https://www.fileformat.info/format/tiff/corion-lzw.htm

"The function GetNextCode() retrieves the next code from the LZW-coded data. It must keep track of bit boundaries. It knows that the first code that it gets will be a 9-bit code. We add a table entry each time we get a code, so GetNextCode() must switch over to 10-bit codes as soon as string #511 is stored into the table."

So we need to change the code length as soon as #511 is stored.
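To make the rule concrete, here is a minimal sketch of the switch-over as the spec text describes it. It is not the code from src/lzw.ts. The helper name and constants are hypothetical, indices are assumed to be zero-based code values (0-255 literals, 256 Clear, 257 EndOfInformation, first stored string at 258), and the 1023 and 2047 thresholds come from the same section of the TIFF 6.0 spec rather than from the excerpt quoted above.

```typescript
// Hypothetical helper for illustration; not taken from image-js/tiff.
const MIN_BITS = 9; // width of the first code read
const MAX_BITS = 12; // TIFF LZW codes never grow beyond 12 bits

// Returns the code width to use for the next read, given the zero-based
// index of the table entry that was just stored.
function codeLengthAfterStoring(lastStoredIndex: number): number {
  let bits = MIN_BITS;
  // Widen once entry (2^bits - 1) has been stored: 511, then 1023, then 2047.
  while (bits < MAX_BITS && lastStoredIndex >= (1 << bits) - 1) {
    bits++;
  }
  return bits;
}

console.log(codeLengthAfterStoring(510)); // still 9-bit codes
console.log(codeLengthAfterStoring(511)); // switch to 10-bit codes
```

Under this reading, a decoder that switches when it stores index 510 would widen one entry earlier than the spec text suggests, unless its counter is offset by one relative to the spec's numbering.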

However, in the code the change is done at 510:

https://github.com/image-js/tiff/blob/73ca97100c0674855db1be3aa71bb5385785a09c/src/lzw.ts#L94-L96

I don't know which version is correct, but the confusion could be due to the TIFF version:

https://stackoverflow.com/questions/26366659/whats-special-about-tiff-5-0-style-lzw-compression

However, the current implementation in the 'debug-lzw' branch seems correct, based on the LZW images we have and a comparison with the output of ImageMagick's convert.

https://github.com/image-js/tiff/commit/199aa6da9d01df5d5a432339587c3d57aa83a5ec

targos commented 3 years ago

"However in the code the change is done at 510:"

I think it's just because #511 is what you get if you count from one, and we count from zero.