fukuchi / libqrencode

A fast and compact QR Code encoding library
https://fukuchi.org/works/qrencode/
GNU Lesser General Public License v2.1

Questionable interpretation of mask scoring criterion number 3 (finder pattern) #220

Open whitslack opened 7 months ago

whitslack commented 7 months ago

One reason why libqrencode chooses a different mask than other QR encoders (related: #115, #192) may be that its calculation of mask scores entails a questionable interpretation of scoring criterion number 3 (existence of finder pattern inside the QR code).

ISO/IEC 18004:2015 “Information technology — Automatic identification and data capture techniques — QR Code bar code symbology specification,” § 7.8.3.1 “Evaluation of QR Code symbols,” Table 11, Note 3, reads:

If the light area of more than 4 module wide exists after or before a 1:1:3:1:1 ratio (dark:light:dark:light:dark) pattern, the imposed penalty shall be 40 points.

There are at least three points of ambiguity in this specification:

  1. Does the penalty apply to a “light area of more than 4 module wide” (i.e., strictly >4), as stated in Note 3, or does it apply to a “light area 4 modules wide” (i.e., ≥4), as stated in Table 11?

  2. Does the quiet zone around the QR Code qualify as a “light area of more than 4 module wide”? Libqrencode assumes that it does, and so it assesses six N3 demerits for each legitimate finder pattern in the QR Code: three horizontally and three vertically. The problem is that libqrencode then assesses no additional N3 demerit when four light modules occur on the other side of a finder pattern (i.e., toward the interior of the QR Code, away from the quiet zone). Because every legitimate finder pattern is already flagged as failing the N3 test on account of the surrounding quiet zone, libqrencode cannot discriminate between maskings that introduce such false quiet zones and maskings that do not.

    The quiet zone is specified to be exactly the width of 4 modules, so if the N3 test is meant to apply strictly to a “light area more than 4 module wide,” then the quiet zone does not meet that requirement. Alternatively, if the specification intended, “If 4 light modules exist after or before a 1:1:3:1:1 ratio…”, then the quiet zone does not meet that requirement either, as it has a width of 4 modules but does not contain any modules.

    It is reasonable to believe that the specification did not intend for the quiet zone to be interpreted as part of the QR Code for purposes of the N3 test, as the intent seems to be to demerit maskings that introduce a false quiet zone on the wrong side of a finder pattern. If the real quiet zone causes every finder pattern to be flagged already, as happens in libqrencode, then the ability to demerit false quiet zones adjacent to the real finder patterns is lost.

  3. If we disregard the quiet zone, then another ambiguity arises in the wording of the specification. Is the N3 penalty to be imposed for each occurrence of the finder pattern found in the masking, or is it to be imposed once if the pattern exists anywhere in the masking? The specification is clear that the N1 and N2 penalties apply to each occurrence of the relevant bad pattern, with Note 1 mentioning, “The rule of this calculation is that 3 penalty points shall be added to each block of five consecutive modules, 4 penalty points for each block of six consecutive modules and so on…” (emphasis added), and Note 2 stating, “The penalty point shall be equal to the number of blocks with 2 x 2 light or dark modules” (emphasis added). However, Note 3 uses neither the word “each” nor the word “number” but does use the word “exists.”

    Given that the N3 penalty of 40 points is far and away the greatest of the four defined penalties, there is a distinct possibility that the specification intended for it to be imposed only once if a masking fails the N3 test at all. Nevertheless, as it is somewhat uncommon for maskings to produce finder patterns (again, assuming we are disregarding the quiet zone), it is not as important whether an implementation assesses an N3 penalty once per pattern occurrence or once for the whole masking.

I believe that the best path forward for libqrencode would be to disregard the quiet zone for purposes of tallying the N3 demerits. This would allow it to properly demerit maskings that introduce false quiet zones adjacent to the actual finder patterns and could bring its mask choices more into agreement with those of other encoders.
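The effect described above can be reproduced with a small sketch. This is not libqrencode's code: the function name, the `quiet_zone` and `min_light` parameters, and the two sample rows are all made up for illustration, and the sketch assumes that each 1:1:3:1:1 pattern earns at most one demerit no matter how many of its sides border a light run. Modules are written as `x` (dark) and `-` (light).

```python
FINDER = "x-xxx-x"  # 1:1:3:1:1 dark:light:dark:light:dark


def count_n3_per_pattern(line, quiet_zone=4, min_light=4):
    """Count N3 demerits in one row or column, at most one per pattern.

    line        -- string of 'x' (dark) and '-' (light) modules
    quiet_zone  -- light modules assumed beyond each end (0 = disregard it)
    min_light   -- light run length that triggers the demerit
                   (4 for ">= 4", 5 for "more than 4")
    """
    padded = "-" * quiet_zone + line + "-" * quiet_zone
    light = "-" * min_light
    count = 0
    start = padded.find(FINDER)
    while start != -1:
        end = start + len(FINDER)
        # One demerit if EITHER side borders a long enough light run.
        if padded[:start].endswith(light) or padded[end:].startswith(light):
            count += 1
        start = padded.find(FINDER, start + 1)
    return count


# Hypothetical 21-module rows through a left-hand finder pattern.
clean_row = "x-xxx-x-" + "xx-xx-xx-xx-x"     # nothing suspicious inside
false_qz_row = "x-xxx-x-" + "---x-xx-xx-xx"  # four light modules after it

print(count_n3_per_pattern(clean_row, quiet_zone=4))     # 1
print(count_n3_per_pattern(false_qz_row, quiet_zone=4))  # 1
print(count_n3_per_pattern(clean_row, quiet_zone=0))     # 0
print(count_n3_per_pattern(false_qz_row, quiet_zone=0))  # 1
```

With the quiet zone counted, both rows score the same, so the false quiet zone goes unnoticed. With the quiet zone disregarded, only the row that contains the false quiet zone is demerited. Under the strict reading (`min_light=5`), a quiet zone of exactly four modules does not trigger the demerit for the clean row either.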

unixdj commented 4 months ago

2: It does include the quiet zone, but finder patterns may overlap. So if ----x-xxx-x---- occurs in the middle of the code, it counts as two patterns.

3: Each.

Disregarding finder patterns extending into the quiet zone defeats the purpose. Each code starting with 18 penalties of 40 points for finder patterns is not an issue, just like it's not an issue that each pattern has 4 black boxes, 4 white 5-pixel runs, 4 black 7-pixel runs and 2 white 8-pixel runs.
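The figure of 18 baseline penalties can be checked with a rough sketch. It builds a hypothetical 21×21 (version 1 sized) grid that contains only the three finder patterns and their light separators, fills everything else with dark modules as a stand-in for real data, and counts one demerit per qualifying side of each 1:1:3:1:1 pattern. The helper names and the dark filler are assumptions made for this illustration, not anything taken from libqrencode.

```python
FINDER_ROWS = [
    "xxxxxxx",
    "x-----x",
    "x-xxx-x",
    "x-xxx-x",
    "x-xxx-x",
    "x-----x",
    "xxxxxxx",
]


def skeleton(size=21):
    """Grid with three finder patterns, light separators, dark elsewhere."""
    grid = [["x"] * size for _ in range(size)]
    for top, left in ((0, 0), (0, size - 7), (size - 7, 0)):
        # Clear the finder area plus a one-module separator, clipped to the grid.
        for r in range(max(top - 1, 0), min(top + 8, size)):
            for c in range(max(left - 1, 0), min(left + 8, size)):
                grid[r][c] = "-"
        for dr, row in enumerate(FINDER_ROWS):
            for dc, ch in enumerate(row):
                grid[top + dr][left + dc] = ch
    return ["".join(row) for row in grid]


def count_n3_sides(line, quiet_zone):
    """One demerit per side of each pattern that borders four light modules."""
    padded = "-" * quiet_zone + line + "-" * quiet_zone
    count = 0
    start = padded.find("x-xxx-x")
    while start != -1:
        count += padded[:start].endswith("----")
        count += padded[start + 7:].startswith("----")
        start = padded.find("x-xxx-x", start + 1)
    return count


def total_n3(rows, quiet_zone):
    """Sum the demerits over every row and every column of the grid."""
    cols = ["".join(col) for col in zip(*rows)]
    return sum(count_n3_sides(line, quiet_zone) for line in rows + cols)


rows = skeleton()
print(total_n3(rows, quiet_zone=4))       # 18
print(total_n3(rows, quiet_zone=4) * 40)  # 720 penalty points
print(total_n3(rows, quiet_zone=0))       # 0
```

Each finder pattern contributes three rows and three columns that cross its 3×3 dark center, and each of those borders the quiet zone on exactly one side, giving 3 × 6 = 18 demerits when the quiet zone is counted and none when it is disregarded.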

whitslack commented 4 months ago

> So if ----x-xxx-x---- occurs in the middle of the code, it counts as two patterns.

@unixdj: If libqrencode actually counted it that way, then that would be fine, and what you're saying about all codes starting out with the same number of demerits for the standard-mandated patterns would indeed imply that it does not matter whether the quiet zone is treated as though it contained light modules. However, the problem specifically in libqrencode is that it doesn't demerit ----x-xxx-x---- twice, so it is failing to properly demerit a run of four light modules that occurs adjacent to a genuine finder pattern. Effectively "not noticing" when a run of four light modules occurs in the interior of the code, adjacent to a finder pattern, means libqrencode can sometimes select a suboptimal mask.
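The difference between the two counting rules discussed here can be made concrete with a short sketch. The function names and sample rows are hypothetical. `per_side` follows the "counts as two patterns" reading, while `per_pattern` assumes at most one demerit per 1:1:3:1:1 pattern, which is the behavior this comment attributes to libqrencode.

```python
def n3_matches(line, quiet_zone=0):
    """Yield (light_before, light_after) for every 1:1:3:1:1 pattern."""
    padded = "-" * quiet_zone + line + "-" * quiet_zone
    start = padded.find("x-xxx-x")
    while start != -1:
        yield (padded[:start].endswith("----"),
               padded[start + 7:].startswith("----"))
        start = padded.find("x-xxx-x", start + 1)


def per_side(line, quiet_zone=0):
    """One demerit for each side that borders four light modules."""
    return sum(before + after for before, after in n3_matches(line, quiet_zone))


def per_pattern(line, quiet_zone=0):
    """At most one demerit per pattern, whichever side matched."""
    return sum(before or after for before, after in n3_matches(line, quiet_zone))


interior = "----x-xxx-x----"
print(per_side(interior), per_pattern(interior))  # 2 1

# Rows through a real finder pattern, with the quiet zone counted as light.
clean_row = "x-xxx-x-" + "xx-xx-xx-xx-x"
false_qz_row = "x-xxx-x-" + "---x-xx-xx-xx"
print(per_side(clean_row, 4), per_side(false_qz_row, 4))        # 1 2
print(per_pattern(clean_row, 4), per_pattern(false_qz_row, 4))  # 1 1
```

Under per-side counting, including the quiet zone is harmless, because the row with the false quiet zone still scores higher than the clean one. Under per-pattern counting the two rows tie, and that tie is the loss of discrimination described above.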