BYTE mode implicitly assumes data is in UTF-8 because that's what `encodeURIComponent`/`decodeURIComponent` assume: https://github.com/cozmo/jsQR/blob/01d3b0a3889b6da02486ea5c26e5bfaaa268d61a/src/decoder/decodeData/index.ts#L129-L146
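
To make the failure mode concrete, here is a minimal sketch of the percent-escape-then-`decodeURIComponent` technique. It is not a copy of the jsQR source; `bytesToTextAssumingUtf8` is a hypothetical name, and returning an empty string on failure is an assumption made for illustration.

```typescript
// Hypothetical helper illustrating the technique; not the jsQR implementation.
// Each byte becomes a %XX escape, and decodeURIComponent then interprets
// the escaped sequence as UTF-8.
function bytesToTextAssumingUtf8(bytes: number[]): string {
  const escaped = bytes
    .map((b) => "%" + ("0" + b.toString(16)).slice(-2))
    .join("");
  try {
    return decodeURIComponent(escaped);
  } catch {
    // decodeURIComponent throws URIError when the bytes are not valid UTF-8.
    // Assumption for this sketch: signal failure with an empty string.
    return "";
  }
}

const utf8Bytes = [0x63, 0x61, 0x66, 0xc3, 0xa9]; // "café" encoded as UTF-8
const latin1Bytes = [0x63, 0x61, 0x66, 0xe9]; // "café" encoded as ISO-8859-1

console.log(bytesToTextAssumingUtf8(utf8Bytes)); // prints "café"
console.log(bytesToTextAssumingUtf8(latin1Bytes)); // prints "" (text is lost)
```

The second call shows the problem: a perfectly valid ISO-8859-1 payload yields no text at all, and only the raw bytes survive.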
If the text is not in UTF-8 then appending to `.text` fails (renamed -> `.data` here for output), but the content is available in `.bytes` (renamed -> `.binaryData` for output), so this isn't the end of the world.

It is, however, against the QR spec. You implemented ECI in #71 by just recording them in the chunks, but you're supposed to use them to set the character set in use for each BYTE segment. The newer fork of `zxing` handles this by keeping `CurrentCharset` around and using it when it runs into a BYTE segment (and the old fork does basically the same thing with `currentCharacterSetECI`). That means having a charset-to-charset conversion library like `iconv` available, or hardcoding the to-Unicode translations.

`qrcode` seems to make the same assumption, so you won't have interoperability problems if you keep all your QR coding in JavaScript, and you probably won't have problems unless you're mixing languages, but the bug is lurking.

The QR standard was written in an earlier age. As the authors of a popular implementation, you could probably get away with just declaring that UTF-8 is the only valid text encoding, which is a 67-times better solution than messing with the proprietary ECI character sets that are rapidly being forgotten. If you go that route, I would ask for a prominent note in the docs explaining that you're specifically only implementing a (sane) subset of the QR spec, and that you coordinate with `qrcode` to give up their plan to implement ECI and apply the same UTF-8-only restriction.
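
For comparison, the ECI-aware alternative (keep a current charset and apply it to each BYTE segment) could look roughly like the sketch below. Everything here is an assumption for illustration: the `Chunk` shapes are hypothetical and may not match jsQR's real chunk types, the ECI table is deliberately partial, the `"utf-8"` default is a policy choice rather than something the spec dictates, and `TextDecoder` only knows legacy encodings such as Shift JIS when the runtime ships full ICU data.

```typescript
// Hypothetical chunk shapes for this sketch; jsQR's real types may differ.
type Chunk =
  | { type: "eci"; assignmentNumber: number }
  | { type: "byte"; bytes: number[] };

// Partial map from ECI assignment number to a WHATWG encoding label.
// Note: the WHATWG "iso-8859-1" label actually selects windows-1252,
// which differs from true ISO-8859-1 in the 0x80-0x9F range.
const ECI_TO_LABEL: Record<number, string> = {
  3: "iso-8859-1",
  20: "shift_jis",
  26: "utf-8",
};

// Walk the chunks in order, remembering the most recent ECI and using it
// to decode every BYTE segment that follows.
function decodeChunks(chunks: Chunk[], defaultLabel = "utf-8"): string {
  let currentLabel = defaultLabel; // assumption: default when no ECI was seen
  let text = "";
  for (const chunk of chunks) {
    if (chunk.type === "eci") {
      const label = ECI_TO_LABEL[chunk.assignmentNumber];
      if (label === undefined) {
        throw new Error(`Unsupported ECI ${chunk.assignmentNumber}`);
      }
      currentLabel = label;
    } else {
      // fatal: true makes invalid input throw instead of silently
      // producing U+FFFD replacement characters.
      const decoder = new TextDecoder(currentLabel, { fatal: true });
      text += decoder.decode(new Uint8Array(chunk.bytes));
    }
  }
  return text;
}

const chunks: Chunk[] = [
  { type: "eci", assignmentNumber: 3 },
  { type: "byte", bytes: [0x63, 0x61, 0x66, 0xe9] }, // "café" in ISO-8859-1
];
console.log(decodeChunks(chunks)); // prints "café"
```

Even this small table hints at the maintenance cost of the full ECI list, which is the argument for the UTF-8-only restriction.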