Closed rendaw closed 5 years ago
Build failure unrelated?
Thank you for looking into this! I need to reacquaint myself with the spec and the code a bit, but will try to reply soon, maybe later today.
Your explanation makes perfect sense; I must have missed that part of the spec.
I'm not sure if this would matter in practice, but it seems like to be 100% conformant, we should pad up to 2**code_size
(code_size
being the "minimum" size, before we add 1 for the special codes), rather than stopping at 4?
Right now numColors is the padded size from the color table, so it's guaranteed to be a power of two but its minimum value is 2 rather than 4.
The 4 is kind of a magical number, maybe adding a doc comment to the LzwEncoder constructor specifying that numColors must be a power of two? Or I could make it a constant with a comment on its derivation?
I have an alternate fix in #15
Oh right -- forgot that numColors
was already padded to a power of 2.
I think I prefer your other fix for now, and afterward I can refactor a bit to explicitly derive numColors
from the code size (to make sure it's clear where the minimum comes from), and maybe name it something different as a hint that numColors
isn't necessarily the same as the color table size.
I think this fixes #11.
My understanding is that while the color table has a minumum size of 2, the LZW minimum code size of 2 indicates that the code table has 4 colors, so the code table needs to be padded with two empty colors before the cc and eoi codes. I believe decoders were failing because they expected cc and eoi codes to have different code indexes and were thus misinterpreting the whole stream (first code in the stream is always cc).
I think this is indicated on page 32 here https://www.w3.org/Graphics/GIF/spec-gif89a.txt although I'm bad with specs without lots of pictures and pretty diagrams... it says
so for a code size of 2, the cc index would be 4 (0-1 would be colors, 2-3 ??).
gifencoder currently has 2 as the current minimum for the LZW minimum code size value, and I believe this is correct (i.e. the issue isn't that the minimum code size is being set too high). For reference, the definition says a bit above that that for 2 color images the minimum code size is 2:
I couldn't find information on padding. My understanding is that since the code table is used by matching color table indexes, reusing color index 0 for the padding elements is fine since a lookup for color 0 will never get that far.
In any case I tried this and I could read the output file with a couple programs (not entirely sure they aren't all using the same library).