micjahn / ZXing.Net

.Net port of the original java-based barcode reader and generator library zxing
Apache License 2.0

Possible ShortReads with ITF-Codes #467

Open boomer41 opened 1 year ago

boomer41 commented 1 year ago

In some circumstances, a short read can occur with ITF codes. Because ITF codes may contain bar patterns in the middle of the code that resemble the end marker [1], a quiet zone check is needed. This check is already implemented and works most of the time.

However, we observed some interesting cases when dealing with scanned documents. For reference, the problem described below has already occurred 4 or 5 times in the wild.

The problem

Because the decoding algorithm for 1D codes operates on one row at a time, without any reference to the rows above or below, a short read can occur when the code is tilted ever so slightly, such that the digit "6" is the last block of black pixels in the row. Because the pattern for 6 is extremely similar to the end marker, the library short-reads. The quiet zone check also passes, because the black pixels that would continue the code are not present in those rows.

After some testing, I managed to build an example image [2] that demonstrates this issue. The picture is taken directly from a scanned document. Because the decoding algorithm starts at the middle row, the interesting row is exactly centered in this example. The code should read 0002216887, but the library returns 000221.

Solution proposal

Maybe the OneDReader should also check n rows above and/or below. When a valid barcode is found on, for example, row j=10, a sanity check would be performed: with n=5, the scan would run on rows 5, 6, 7, 8, 9, 11, 12, 13, 14, 15 (that is, [j-n; j+n] \ {j}) to see whether the same barcode can be found there. If more than a given share of the other rows (say 50% or 30%) are also valid and decode to the same value, the barcode would be returned. If this check fails, we may be short-reading and should not return a code. The row-handling code would then continue with rows further above, where the full code can be found.
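A minimal sketch of the proposed sanity check, to make the idea concrete. This is not ZXing.Net code: the class RowConsensus, the method IsConfirmedByNeighbours, and the decodeRow callback (which stands in for whatever decodes a single row) are hypothetical names, and the default threshold of 50% is just one of the two values suggested above.

```csharp
using System;

// Hypothetical helper, not part of ZXing.Net.
public static class RowConsensus
{
    // decodeRow: assumed callback that decodes one image row and returns
    //            the decoded text, or null if nothing was found in that row.
    // row:       j, the row on which the candidate was decoded.
    // height:    image height, used to skip rows outside the image.
    // candidate: the text decoded on row j.
    // n:         number of rows checked on each side of j.
    // minAgreement: share of neighbouring rows that must be exceeded.
    public static bool IsConfirmedByNeighbours(
        Func<int, string> decodeRow,
        int row,
        int height,
        string candidate,
        int n = 5,
        double minAgreement = 0.5)
    {
        int checkedRows = 0;
        int agreeing = 0;

        // Walk over [j-n; j+n] \ {j}, clipped to the image.
        for (int r = row - n; r <= row + n; r++)
        {
            if (r == row || r < 0 || r >= height)
                continue;

            checkedRows++;
            if (string.Equals(decodeRow(r), candidate, StringComparison.Ordinal))
                agreeing++;
        }

        // Without any neighbouring row, nothing can be confirmed.
        if (checkedRows == 0)
            return false;

        // "More than" the threshold, as proposed above.
        return agreeing > minAgreement * checkedRows;
    }
}
```

With a tilted code where only two or three rows around j produce the short value, fewer than half of the neighbouring rows agree, so the short read is rejected, while a row in the intact part of the code is confirmed by its neighbours. The cost is up to 2n additional row decodes per candidate.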

Reproducing stuff

Tested with the latest master branch. The issue first occurred on NuGet 0.16.8. We are using the .NET Framework 4.8 version.

The scanner options:

var scanner = new BarcodeReader
{
    AutoRotate = true,
    Options =
    {
        TryHarder = true,
        TryInverted = true,
        PureBarcode = false,
        PossibleFormats = new List<BarcodeFormat>
        {
            BarcodeFormat.ITF
        }
    }
};

References

[1] https://en.wikipedia.org/wiki/Interleaved_2_of_5
[2] example.zip

SanKumSan commented 1 year ago

I have seen some ITF barcodes that are not decoded even though the barcode is very clear. The barcode is in a TIFF file and it is not decoded.

But when the same ITF barcode is in a JPG file, it is decoded.

No idea why the image format of the scanned document affects the decoding of the ITF barcode.

Any input is welcome.