ChenjieXu / pyzxing

Python wrapper of ZXing Java library, making qrcode decoding super easy!
MIT License
150 stars 23 forks

Unable to detect QR codes with particular data values #38

Open simon-staal opened 1 year ago

simon-staal commented 1 year ago

System Information

Operating System: Ubuntu 22.04
Python version: 3.10.6

.venv dump (pip freeze) - also contains other QR detection libraries which were tested:

joblib==1.2.0
numpy==1.24.3
opencv-python==4.7.0.72
py4j==0.10.9.7
PyBoof==0.41
pyzbar==0.1.9
pyzxing==1.0.2
segno==1.5.2
six==1.16.0
transforms3d==0.4.1

Detailed description

I was testing the performance of various QR code detection libraries using "perfect" version 1-H QR codes generated by segno, and I noticed that some of these codes could not be detected. In particular, the following 192 values failed to be detected when testing exhaustively in the range $[0, 10^5)$:

1982, 2189, 2429, 4041, 4135, 4598, 4608, 4705, 5468, 6310, 6466, 6620, 6904, 7418, 9983, 10279, 11066, 11561, 13451, 14861, 14940, 15105, 16171, 16697, 17199, 17237,  17367, 18125, 18249, 18251, 18331, 19198, 19295, 19485, 20451, 21842, 22820, 22975, 23030, 23110, 23389, 23502, 23577, 24380, 24403, 24537, 24836, 25114, 25393, 25481, 26545, 26668, 26783, 26872, 27075, 27649, 29342, 29744, 29759, 30473, 31283, 32182, 33687, 33982, 34427, 35288, 36713, 36731, 37872, 38155, 38630, 39141, 39892, 40506, 41855, 42022, 42536, 42918, 43036, 43452, 44255, 44300, 47121, 47681, 48830, 48942, 49808, 49992, 50922, 52182, 53588, 54099, 54441, 54635, 55294, 55540, 55802, 56831, 57003, 57950, 58161, 58240, 58815, 60599, 60826, 60915, 61078, 61569, 61612, 62202, 62457, 62710, 63392, 64002, 64632, 65289, 65742, 66070, 66175, 66495, 67268, 67345, 67861, 68142, 69169, 69772, 69991, 70558, 70585, 72597, 72769, 73250, 73726, 74664, 74729, 75836, 76625, 77483, 77574, 77762, 77906, 78091, 78339, 78350, 78626, 78678, 78889, 78922, 79267, 79726, 79889, 80122, 80483, 80615, 80680, 80886, 80930, 81243, 81315, 81862, 82485, 82509, 82600, 84801, 85762, 86122, 87319, 87729, 89386, 89478, 89996, 90458, 90709, 90938, 91397, 91541, 92479, 93120, 93287, 94030, 94865, 95290, 95342, 95377, 95700, 95815, 96620, 97158, 98549, 98665, 99579, 99794
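To put the list in proportion: 192 failures out of 100000 tested values is a failure rate of 0.192%. By my count, 15 of the failing values have four digits, 177 have five, and none have one to three digits. A minimal sketch for producing this kind of summary follows; the helper name `summarize_failures` is hypothetical, and the sample passed to it is only a handful of values from the list above, not the full set.

```python
from collections import Counter


def summarize_failures(failures, tested):
    """Return (failure rate, failure count per payload digit length)."""
    by_digits = Counter(len(str(value)) for value in failures)
    return len(failures) / tested, dict(sorted(by_digits.items()))


# A handful of values from the list above, used only to show the call.
sample = [1982, 2189, 2429, 10279, 11066]
rate, by_digits = summarize_failures(sample, tested=100000)
print(f"rate={rate:.5%}, failures by digit count={by_digits}")
```

Running it on the full `bad_detections` list from the reproduction script would give the breakdown for the whole range.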

This issue is not exclusive to pyzxing. Other libraries I tested, such as OpenCV and BoofCV, also failed on some payloads, although on different values. Those failures have been confirmed as a bug caused by false positives in finder pattern detection (see lessthanoptimal/PyBoof#23), which may or may not be the same issue here. Please note that there is also an identified bug in segno in which the padding is malformed for 4-digit payloads (see heuer/segno#123). This does not seem to affect pyzxing; I assume it ignores the padding bits, which seems to be a strategy employed by many detector libraries.
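For context on where the padding bits sit, here is a minimal sketch of how the data codewords of a numeric-mode version 1-H symbol are laid out, based on my reading of the QR code standard (ISO/IEC 18004). The function name `numeric_data_codewords` is hypothetical, and this is not code from segno or ZXing. It only shows that everything after the terminator is filler that a decoder can skip.

```python
from typing import List

DATA_CODEWORDS_1H = 9      # version 1, level H holds 9 data codewords (72 bits)
PAD_CODEWORDS = (0xEC, 0x11)  # pad codewords alternate between these two values


def numeric_data_codewords(payload: str, capacity: int = DATA_CODEWORDS_1H) -> List[int]:
    """Build the data codewords (before error correction) for a numeric payload."""
    if not (payload.isascii() and payload.isdigit()):
        raise ValueError("payload must consist of ASCII digits only")

    bits = "0001"                          # numeric mode indicator
    bits += format(len(payload), "010b")   # 10-bit character count (versions 1 to 9)

    # Digits are packed in groups of three (10 bits); a trailing group of
    # two digits takes 7 bits and a single trailing digit takes 4 bits.
    for start in range(0, len(payload), 3):
        group = payload[start:start + 3]
        width = {3: 10, 2: 7, 1: 4}[len(group)]
        bits += format(int(group), f"0{width}b")

    max_bits = capacity * 8
    if len(bits) > max_bits:
        raise ValueError("payload does not fit into the symbol")

    bits += "0" * min(4, max_bits - len(bits))  # terminator, at most 4 zero bits
    bits += "0" * (-len(bits) % 8)              # zero bits up to the byte boundary

    codewords = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    pad_index = 0
    while len(codewords) < capacity:            # fill the rest with pad codewords
        codewords.append(PAD_CODEWORDS[pad_index % 2])
        pad_index += 1
    return codewords


print(" ".join(f"{cw:02X}" for cw in numeric_data_codewords("1982")))
# 10 10 C6 20 EC 11 EC 11 EC
```

For the 4-digit payload `1982`, the mode indicator, count, digits, and terminator fill exactly four codewords, so the remaining five are pure padding.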

Steps to reproduce

import cv2
import segno
from pyzxing import BarCodeReader

TEST_FILE = 'qr_test.png'

reader = BarCodeReader()
bad_detections = []

for i in range(0, 100000):
    # Generate a version 1 code; the assert checks that segno chose level H.
    qrcode = segno.make(i, version=1)
    assert qrcode.error == 'H'
    qrcode.save(TEST_FILE, scale=5)

    # Load the image, convert it to grayscale, and downscale it before decoding.
    img = cv2.imread(TEST_FILE)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img = cv2.resize(img, (88, 88))

    barcodes = reader.decode_array(img)
    assert len(barcodes) == 1
    # A result without a 'raw' field is treated as a failed detection.
    if 'raw' not in barcodes[0]:
        print(f"Unable to detect {i}")
        bad_detections.append(i)
    else:
        decoded = barcodes[0]['raw'].decode("utf-8")
        assert decoded == str(i), f"Mismatch between detected value {decoded} and input {i}"

print(f"Unable to detect the following data payloads:\n{bad_detections}")
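One detail of the script above that may be worth isolating is the resize step. The sketch below works out the image geometry under two assumptions that are not stated in the issue: that segno's default quiet zone for QR codes is 4 modules, and that a version 1 symbol is 21 modules wide. The helper name `module_pixels` is hypothetical. Whether this resampling contributes to the failed detections is not established here.

```python
def module_pixels(version=1, border=4, scale=5, target=88):
    """Return (saved image side in px, px per module after resizing to target)."""
    modules = 17 + 4 * version      # symbol side in modules (21 for version 1)
    total = modules + 2 * border    # add the quiet zone on both sides
    saved_side = total * scale      # side length of the saved PNG
    return saved_side, target / total


saved_side, px_per_module = module_pixels()
print(saved_side, round(px_per_module, 3))  # 145 3.034
```

Under these assumptions the 145 px image is downscaled to 88 px, which leaves about 3.03 px per module rather than a whole number, so the decoder does not see pixel-perfect module edges. Repeating the test without the resize, or with a target size that is a multiple of 29, could help separate decoder behaviour from resampling effects.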

legut2 commented 3 months ago

This is really interesting. Thanks for sharing, @simon-staal. I was running into detection issues and was scratching my head about it.