DavBfr / dart_barcode

Barcode generation library
https://pub.dev/packages/barcode
Apache License 2.0

Datamatrix SVG bytes are off-by-one from version 2.2.2 onwards #58

Closed whatthehecker closed 5 months ago

whatthehecker commented 1 year ago

First off, thank you for this great library!

Using version 2.2.2 onwards gives different results from previous versions when using toSvgBytes() to generate an SVG image of a datamatrix.

Example (full code at https://github.com/whatthehecker/flutter_barcode_off_by_one):

import 'dart:convert';
import 'dart:typed_data';
import 'package:barcode/barcode.dart';
final Barcode dataMatrix = Barcode.dataMatrix();
final Uint8List bytes = latin1.encode('abcdefghijklmnopqrstuvwxyz');
final String svgContent = dataMatrix.toSvgBytes(bytes);

Versions below 2.2.2 return a datamatrix which correctly contains the original string abcdefghijklmnopqrstuvwxyz, while versions 2.2.2 onwards return all bytes minus 1 (resulting in `abcdefghijklmnopqrstuvwxy, with every character shifted down by one).

Datamatrix of 2.2.2 (incorrect): [image: version_2.2.2]

Datamatrix of 2.2.1 (correct): [image: version_2.2.1]

DavBfr commented 1 year ago

Use dataMatrix.toSvg('abcdefghijklmnopqrstuvwxyz') for textual data. toSvgBytes is for custom binary data.

whatthehecker commented 1 year ago

That seems to do the trick for textual data.

I am still not convinced that this is expected behavior. I would expect a series of bytes to be encoded to the same barcode contents in both versions, no matter what the bytes themselves represent. If I run the code with dataMatrix.toSvgBytes(Uint8List.fromList([48, 65, 97])) (these happen to be ASCII bytes, but only so that I can check the contents with a generic QR code scanner app), I get different outputs depending on the version used.

Am I missing something?

I can see that 2.2.1 calls _encodeText in convert which is itself called from makeBytes, effectively adding 1 to each byte of both binary and textual data in this line: https://github.com/DavBfr/dart_barcode/blob/b0e5ed689c75f55460f62bd74bd9c6f22225ca47/barcode/lib/src/datamatrix.dart#L96

2.2.2 introduces make which is called from toSvg and itself calls DataMatrixEncoder()..ascii(data) for textual data, triggering a version of the above line and adding 1 to each byte. For binary data, convert is called which is now changed to not modify the original bytes, making 2.2.1 and 2.2.2 output a different barcode.
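The difference between the two code paths described above can be modelled in a few lines. This is only a sketch based on the description in this comment, not the library's actual code: asciiCodewords and rawCodewords are hypothetical names, and the sketch only covers byte values 0 to 127.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Hypothetical model of the 2.2.1 path: every byte is treated as an ASCII
/// character and stored as the codeword `value + 1` (valid for 0..127 only).
Uint8List asciiCodewords(List<int> bytes) =>
    Uint8List.fromList([for (final b in bytes) b + 1]);

/// Hypothetical model of the 2.2.2 path for binary input: the bytes are
/// used as codewords unchanged.
Uint8List rawCodewords(List<int> bytes) => Uint8List.fromList(bytes);

void main() {
  final input = latin1.encode('abc'); // [97, 98, 99]
  print(asciiCodewords(input)); // [98, 99, 100]
  print(rawCodewords(input)); // [97, 98, 99]
}
```

The same input therefore produces codewords that differ by one between the two models, which matches the off-by-one seen when scanning.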

DavBfr commented 1 year ago

Look at that: https://en.wikipedia.org/wiki/Data_Matrix under Encoding

The toSvgBytes lets you write the encoding you want with no changes. But the reader will interpret the result according to the table in the page.
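To illustrate how a reader interprets raw codewords, here is a minimal decoder sketch for the ASCII part of that table. decodeAsciiCodewords is a hypothetical helper, not part of the barcode package, and it only handles plain ASCII (codewords 1 to 128), the pad codeword (129) and digit pairs (130 to 229).

```dart
import 'dart:convert';

/// Decodes Data Matrix codewords the way a reader in ASCII mode would.
/// Hypothetical helper: covers only plain ASCII, pad and digit pairs.
String decodeAsciiCodewords(List<int> codewords) {
  final out = StringBuffer();
  for (final cw in codewords) {
    if (cw >= 1 && cw <= 128) {
      out.writeCharCode(cw - 1); // ASCII value is codeword minus one
    } else if (cw == 129) {
      break; // pad codeword: end of message
    } else if (cw >= 130 && cw <= 229) {
      out.write((cw - 130).toString().padLeft(2, '0')); // digit pair 00..99
    } else {
      throw FormatException('Codeword $cw is outside this sketch');
    }
  }
  return out.toString();
}

void main() {
  print(decodeAsciiCodewords([48, 65, 97])); // /@`
  print(decodeAsciiCodewords([49, 66, 98])); // 0Aa
  print(decodeAsciiCodewords(latin1.encode('abcdefghijklmnopqrstuvwxyz')));
}
```

Feeding the raw ASCII bytes of a string into this decoder yields every character shifted down by one, while bytes that were incremented first decode to the original text.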

If you use toSvg it uses the Text mode to properly encode characters. But you can't use FNC1 or other control characters.

The DataMatrixEncoder class is a helper to encode the modes and is used to encode text when using toSvg.
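For the case where control characters such as FNC1 are needed, the codewords could be assembled by hand before calling toSvgBytes. The following is a sketch under assumptions: encodeAsciiText is a hypothetical helper (it is not the library's DataMatrixEncoder), and the constants follow the ASCII encodation table on the Wikipedia page, where FNC1 is codeword 232 and digit pairs map to 130 plus their value.

```dart
import 'dart:typed_data';

const int fnc1 = 232; // FNC1 codeword in Data Matrix ASCII encodation

bool _isDigit(int c) => c >= 0x30 && c <= 0x39;

/// Hypothetical helper: encodes 7-bit ASCII text as codewords, packing
/// digit pairs as 130 + value and all other characters as value + 1.
List<int> encodeAsciiText(String text) {
  final units = text.codeUnits;
  final out = <int>[];
  var i = 0;
  while (i < units.length) {
    final c = units[i];
    if (c > 127) throw ArgumentError('Only 7-bit ASCII is handled here');
    final next = i + 1 < units.length ? units[i + 1] : -1;
    if (_isDigit(c) && _isDigit(next)) {
      out.add(130 + (c - 0x30) * 10 + (next - 0x30));
      i += 2;
    } else {
      out.add(c + 1);
      i += 1;
    }
  }
  return out;
}

void main() {
  final codewords = Uint8List.fromList([fnc1, ...encodeAsciiText('0112A')]);
  print(codewords); // [232, 131, 142, 66]
}
```

Assuming toSvgBytes in 2.2.2 and later writes codewords unchanged, as described above, a list built this way could be passed to it directly.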

whatthehecker commented 1 year ago

I think we are miscommunicating here.

My question is not about whether using toSvgBytes to write textual data is buggy. I understand that it writes the raw bytes without touching them and as such the ASCII values are incorrectly interpreted by a reader since a raw ASCII value is not the correct encoding.

My question is about whether this change between versions is an intentional bugfix. 2.2.1 applies encoding to ASCII characters when using toSvgBytes and does not write the "raw" bytes, so I assume 2.2.1 was buggy in this regard. When I originally wrote the code that incorrectly used toSvgBytes instead of toSvg, this resulted in the correct string data. This suddenly changed with 2.2.2. Since I did not see anything mentioning this in the release notes, I was simply wondering whether this was an intentional change.

To prevent other people from making the same mistake when choosing between the two methods, I think it would be helpful to add a doc comment to toSvgBytes. The current documentation does not emphasize that it writes "raw" values, whereas toSvg converts the data before writing bytes.