Closed nico closed 1 year ago
I kind of love that you're looking at the profile byte by byte. Short answer is the 0021
is intended to be padding, and I chose to pad with the UTF-16BE value of !
simply because bangs are more interesting than nulls.
I honestly had forgotten I did that when assembling the profile, which I did manually in a hex editor. I copied that tag into all my v4 profiles, so they've all got it.
According to the ICC v4 spec, section 7.1.2 d) all pad bytes shall be NULL (as defined in ISO/IEC 646, character 0/0)
, so this was strictly a violation of the spec. The reason for the restriction is given in a note that follows:
the likelihood of two profiles which contain the same tag data, yet have different checksum values, is reduced
Certainly allowing pad bytes to contain random/uninitialized data would be against the spirit of the rule, but I figured a single bang wouldn't do any harm 😇
I discovered it because I wrote some code that reads and re-serializes an ICC profile. My code uses null bytes for padding, so my profile was different to yours in that one byte, and due to that, in the profile ID. I used your profile as test input and then was confused for a bit that the output didn't match the input.
Thanks for pointing out that the spec requires null padding bytes, I hadn't seen that yet. Maybe it'd be nice to update your profiles to be spec-compliant then :)
Yeah, honestly I must have missed that note in the spec my first time through it. There was no such requirement in the v2 spec. Certainly if I had known these profiles would get as much use as they have, I would have gone with plain null padding regardless.
It seems reasonable to update the profiles given the spec is clear about the padding rules.
All v4 profiles have been updated with null padding bytes in https://github.com/saucecontrol/Compact-ICC-Profiles/commit/ca0a34b4047e3ebdefc0196759a4d88b85b104b4
This is more a comment than really a bug. It just confused me for a bit. If this is all intentional, feel free to just close it.
The first row is 'mluc', the reserved 0 u32, the number of records u32 (1), and the record size u32 (12), so far so good.
The second row is 2 + 2 bytes for enUS, then an u32 "string size in bytes" (6), then an u32 string data offset, then 4 utf-16 characters 'CC0!'. It's 4 character since tags must be 4-byte aligned, so that's the smallest-possible text.
However, the "string size in bytes" is 6, not 8, so tools think the text is "CC0". That might be intentional to match the v2 profiles, where cprt is stored in a 'text', which must be nul-terminated, so there's only room for "CC0".
And indeed, further up there's
which says that the cprt data is only 0x22 bytes long – it ends after the '0030'. So the "0021" is padding – but if it's padding, why isn't it "0000" (or "2121")?