utf8 text in qrcodes was not handled properly

csm101 commented 8 years ago

I created a barcode containing utf8 characters taken from both the cyrillic and european charset (to be sure that they can't be all casted to ansi characters without seeing any data loss: your ansi charset can show only one of the two), and I added this barcode to the unit tests.

The test was failing: the string was surely being casted to a 8 bit ansi charset, losing any character that can't be represented in the local 8 bit charset (as I suspected this would happen because there was a compiler warning about this happening). the warining was in in TReadResult.SetText.

The suspiscious code was the call to UTF8ToString(AValue) before setting the FText value.

I put a breakpoint there and I did see that AValue was already containing te correct unicode string with all the characters I was expecting to read from the QR-CODE, so there was no need to do any conversion.

Surely the directly Assigning AValue to FText, without any transformation solved the bug under Win32 and Win64 compilers. I don't know if that call to UTF8ToString was needed for Android/Ios: I have never developed with next gen compilers and I can't test it.

Before applying this pull request could you please try to scan the barcode I added with ZXIng compiled for android/ios? you should get this string: "ру́сский язы́к, russkij jazyk èàòù"

Note: I edited the literal strings in the test cases in order to use escape sequences with unicode values of characters instead of leaving the accented letters in the source: this could give problems to programmers who don't use a west-european windows installation. Another option could be to save the sources in UTF-8 format instead of using ANSI...

Spelt commented 8 years ago

Hi Carlo,

Nice one, weird that one of the Russian committers did not catch this one :-)

Android: Confirmed OK IOS 32 Bit : Confirmed OK IOS 64 Bit : Confirmed OK

Op 30 aug. 2016, om 13:56 heeft Carlo Sirna notifications@github.com het volgende geschreven:

I created a barcode containing utf8 characters taken from both the cyrillic and european charset (to be sure that they can't be all casted to ansi characters without seeing any data loss: your ansi charset can show only one of the two), and I added this barcode to the unit tests.

The test was failing: the string was surely being casted to a 8 bit ansi charset, losing any character that can't be represented in the local 8 bit charset (as I suspected this would happen because there was a compiler warning about this happening). the warining was in in TReadResult.SetText.

The suspiscious code was the call to UTF8ToString(AValue) before setting the FText value.

I put a breakpoint there and I did see that AValue was already containing te correct unicode string with all the characters I was expecting to read from the QR-CODE, so there was no need to do any conversion.

Surely the directly Assigning AValue to FText, without any transformation solved the bug under Win32 and Win64 compilers. I don't know if that call to UTF8ToString was needed for Android/Ios: I have never developed with next gen compilers and I can't test it.

Before applying this pull request could you please try to scan the barcode I added with ZXIng compiled for android/ios? you should get this string: "ру́сский язы́к, russkij jazyk èàòù"

Note: I edited the literal strings in the test cases in order to use escape sequences with unicode values of characters instead of leaving the accented letters in the source: this could give problems to programmers who don't use a west-european windows installation. Another option could be to save the sources in UTF-8 format instead of using ANSI...

You can view, comment on, or merge this pull request online at:

https://github.com/Spelt/ZXing.Delphi/pull/39 https://github.com/Spelt/ZXing.Delphi/pull/39 Commit Summary

added utf8 test for barcode containing mixed european and russian characters. it failed. fixed bug File Changes

M Lib/Classes/Common/ZXing.ReadResult.pas https://github.com/Spelt/ZXing.Delphi/pull/39/files#diff-0 (4) A UnitTest/Images/utf8-test.png https://github.com/Spelt/ZXing.Delphi/pull/39/files#diff-1 (0) M UnitTest/Test.pas https://github.com/Spelt/ZXing.Delphi/pull/39/files#diff-2 (18) Patch Links:

https://github.com/Spelt/ZXing.Delphi/pull/39.patch https://github.com/Spelt/ZXing.Delphi/pull/39.patch https://github.com/Spelt/ZXing.Delphi/pull/39.diff https://github.com/Spelt/ZXing.Delphi/pull/39.diff — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Spelt/ZXing.Delphi/pull/39, or mute the thread https://github.com/notifications/unsubscribe-auth/AMEt1TuBS1L46aB90Ips_xYysZshuiQ6ks5qlBpbgaJpZM4JwcZN.

csm101 commented 8 years ago

Well, you can notice this only if you use characters that can't be converted in your local ansi charset: if you are russian and you use only russian characters, all the characters you use will be preserved if you cast an unicode string to your local ansi charset. that's why I mixed characters from different charset.

Is it working correctly also under android?

2016-08-31 10:34 GMT+02:00 E Spelt notifications@github.com:

Hi Carlo,

Nice one, weird that one of the Russian committers did not catch this one :-)

Android: Confirmed OK IOS 32 Bit : Confirmed OK IOS 64 Bit : Confirmed OK

Op 30 aug. 2016, om 13:56 heeft Carlo Sirna notifications@github.com het volgende geschreven:

I created a barcode containing utf8 characters taken from both the cyrillic and european charset (to be sure that they can't be all casted to ansi characters without seeing any data loss: your ansi charset can show only one of the two), and I added this barcode to the unit tests.

The test was failing: the string was surely being casted to a 8 bit ansi charset, losing any character that can't be represented in the local 8 bit charset (as I suspected this would happen because there was a compiler warning about this happening). the warining was in in TReadResult.SetText.

The suspiscious code was the call to UTF8ToString(AValue) before setting the FText value.

I put a breakpoint there and I did see that AValue was already containing te correct unicode string with all the characters I was expecting to read from the QR-CODE, so there was no need to do any conversion.

Surely the directly Assigning AValue to FText, without any transformation solved the bug under Win32 and Win64 compilers. I don't know if that call to UTF8ToString was needed for Android/Ios: I have never developed with next gen compilers and I can't test it.

Before applying this pull request could you please try to scan the barcode I added with ZXIng compiled for android/ios? you should get this string: "ру́сский язы́к, russkij jazyk èàòù"

Note: I edited the literal strings in the test cases in order to use escape sequences with unicode values of characters instead of leaving the accented letters in the source: this could give problems to programmers who don't use a west-european windows installation. Another option could be to save the sources in UTF-8 format instead of using ANSI...

You can view, comment on, or merge this pull request online at:

https://github.com/Spelt/ZXing.Delphi/pull/39 https://github.com/Spelt/ ZXing.Delphi/pull/39 Commit Summary

added utf8 test for barcode containing mixed european and russian characters. it failed. fixed bug File Changes

M Lib/Classes/Common/ZXing.ReadResult.pas https://github.com/Spelt/ ZXing.Delphi/pull/39/files#diff-0 (4) A UnitTest/Images/utf8-test.png https://github.com/Spelt/ ZXing.Delphi/pull/39/files#diff-1 (0) M UnitTest/Test.pas https://github.com/Spelt/ ZXing.Delphi/pull/39/files#diff-2 (18) Patch Links:

https://github.com/Spelt/ZXing.Delphi/pull/39.patch < https://github.com/Spelt/ZXing.Delphi/pull/39.patch> https://github.com/Spelt/ZXing.Delphi/pull/39.diff < https://github.com/Spelt/ZXing.Delphi/pull/39.diff> — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub < https://github.com/Spelt/ZXing.Delphi/pull/39>, or mute the thread < https://github.com/notifications/unsubscribe-auth/AMEt1TuBS1L46aB90Ips_ xYysZshuiQ6ks5qlBpbgaJpZM4JwcZN>.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/Spelt/ZXing.Delphi/pull/39#issuecomment-243696362, or mute the thread https://github.com/notifications/unsubscribe-auth/AFeJm6n37Bce-syWmo8WM32iwYHkmxX_ks5qlTylgaJpZM4JwcZN .

csm101 commented 8 years ago

Android: Confirmed OK IOS 32 Bit : Confirmed OK IOS 64 Bit : Confirmed OK

opps, I didn't read this :) sorry

Spelt / ZXing.Delphi

utf8 text in qrcodes was not handled properly #39