Open Shane32 opened 2 months ago
Time for a little history lesson. Unfortunately, the question is not quite so easy to answer...
In the original implementation of the QRCoder, the CreateQrCode method had neither a forceUtf8 nor an eciMode parameter. The idea was to keep the interface as simple as possible, so that the QRCoder always selects the best ECI mode independently. (“The best” is defined as “the encoding with which the smallest QR code is generated or the most user data can be stored”).
Here I think I follow the ISO standard, which says: “If you only have numeric characters, encode as numeric. If you also have letters, use Alphanumeric, ...”
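To make the quoted rule concrete, here is a minimal sketch of that kind of mode selection. It is written in Python purely for illustration and is not QRCoder's actual code: the function name pick_mode is hypothetical, Kanji mode is left out, and the alphanumeric set is the 45-character set from the QR code standard (digits, upper-case letters, space and $%*+-./:).

```python
# Illustrative only: pick the most compact QR data mode for a string.
# Not QRCoder's implementation; Kanji mode is deliberately omitted.

NUMERIC_CHARS = set("0123456789")
# The 45 characters allowed in QR alphanumeric mode.
ALPHANUMERIC_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def pick_mode(text: str) -> str:
    """Return 'numeric', 'alphanumeric' or 'byte' for the given text."""
    if text and all(ch in NUMERIC_CHARS for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC_CHARS for ch in text):
        return "alphanumeric"
    # Everything else (lower-case letters, umlauts, ...) needs byte mode.
    return "byte"


if __name__ == "__main__":
    for sample in ["12345", "HELLO WORLD", "hello", "Größe"]:
        print(sample, "->", pick_mode(sample))
```

Note that lower-case letters already force byte mode, which is where the question of the character set (ISO 8859-1, UTF-8, ...) comes in.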
Unfortunately, the world is not always that simple, and some QR code readers always expected the user data in ByteMode. If data came in ISO 8859-1 format, these readers displayed special characters incorrectly. (In my opinion, the fault lies with the readers, which do not fully implement the ISO standard.)
In order not to stubbornly insist on the standard and to give users of QRCoder the choice, I have introduced the parameter forceUtf8. (With the explicit note that this parameter breaks with the standard!)
This was done in:
For mapping SlovenianUpn QR codes, it is necessary to encode QR codes in ISO 8859-2 mode. (This is a requirement from the UPN QR Form Standard)
However, the automatic ECI mode selection in the QRCoder has not yet taken this encoding into account. Although the QR code standard allows the encoding, it lists it as an encoding that should be set specifically at the user's request.
The eciMode parameter was introduced to implement the UPN QR code. This was done in:
As you noted @Shane32, the forceUtf8 parameter is actually obsolete. I can no longer say exactly why I didn't remove it in 2018. Either I simply didn't recognize the functional duplication as such at the time or, as is so often the case, I was worried that removing the parameter would result in a breaking change that could annoy our loyal user base.
Long story short: The forceUtf8 parameter is a relic of old times and can probably be removed. However, this should be done carefully. (Possibly via a legacy overload that still accepts the parameter and sets the ECI mode in the background. The overload can be marked as obsolete at the same time and then removed completely with version 2.0.)
Or do you think we should make a hard cut here and remove the parameter completely?
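The legacy-overload idea above could look roughly like the following sketch. It is Python rather than C#, purely as an illustration; create_qr_code, create_qr_code_legacy and the returned dictionary are hypothetical stand-ins and not QRCoder's API. The enum values are the standard ECI assignment numbers for these character sets. The sketch also assumes that an explicitly chosen ECI mode wins over the old flag, which is a design decision the thread has not settled.

```python
import warnings
from enum import Enum


class EciMode(Enum):
    # Values are the ECI assignment numbers; DEFAULT means "let the library decide".
    DEFAULT = 0
    ISO8859_1 = 3
    ISO8859_2 = 4
    UTF8 = 26


def create_qr_code(text: str, eci_mode: EciMode = EciMode.DEFAULT) -> dict:
    """Hypothetical new-style entry point; it only records what was requested."""
    return {"text": text, "eci_mode": eci_mode}


def create_qr_code_legacy(text: str, force_utf8: bool = False,
                          eci_mode: EciMode = EciMode.DEFAULT) -> dict:
    """Hypothetical legacy overload: still accepts force_utf8, but is deprecated."""
    warnings.warn(
        "force_utf8 is deprecated; pass eci_mode=EciMode.UTF8 instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Assumption: the flag only matters when no explicit ECI mode was chosen.
    if force_utf8 and eci_mode is EciMode.DEFAULT:
        eci_mode = EciMode.UTF8
    return create_qr_code(text, eci_mode)


if __name__ == "__main__":
    print(create_qr_code_legacy("Größe", force_utf8=True))
```

In C#, the same effect would typically be achieved by marking the old overload with the Obsolete attribute, so existing callers keep compiling but get a warning until the overload is removed.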
I want to think about it some more, but generally I'm thinking:
EciMode property to select the default encoding. If we need to add additional properties to Payload, such as ForceUtf8, then do so. So for SlovenianUpn QR codes and similar, it is all handled by properties set on Payload.
I'll have to think about it some more when I have additional time.
So should I wait with merging #514?
Yeah; I’ll mark it WIP. I’d rather have a consensus before we merge anything.
Question
When generating a QR code with QRCodeGenerator.CreateQrCode, what is the purpose of forceUtf8 versus eciMode: EciMode.Utf8?

I would expect that specifying EciMode.Utf8 would force the generated code to use the UTF-8 character set when encoding the data. But in fact, it does so only when either (a) there are non-ISO-8859-1 characters present, or (b) forceUtf8 is true.

Consider the following:
A similar phenomenon occurs when specifying EciMode.Iso8859_2 with specific strings; interestingly, forceUtf8 must be true for the code to function correctly.

If we assume that this is a bug, and EciMode.Utf8 should always encode with UTF-8, then what is the purpose of the forceUtf8 argument? My best guess is that it is some compatibility mode to use UTF-8 encoding when encoded with EciMode.Default (which normally encodes as ISO-8859-1).

Suggestion
I suggest that:
- For any EciMode besides EciMode.Default, the text is always encoded in the specified encoding.
- For EciMode.Default with forceUtf8 == false, it uses ISO-8859-1 (per spec, EciMode.Iso8859_1 is default).
- For EciMode.Default with forceUtf8 == true, it uses UTF-8 (against spec).

Spec
From ISO spec page 20: