Closed by Gnbrkm41 4 years ago
I'm also seeing test failures that look very similar, albeit from System.Globalization.Tests.
As of ad97a12, the failing tests are located in this file: https://github.com/dotnet/runtime/blob/ad97a127d91c53080265794c7a57e0faee4d7c2d/src/libraries/System.Globalization/tests/CompareInfo/CompareInfoTests.Compare.cs#L10
at lines 50, 51, 65, 68, 69, 70, 71, 74, 119, 135, 231, 232, 233, 234, and 236.
The failing SortKey tests are located in: https://github.com/dotnet/runtime/blob/ad97a127d91c53080265794c7a57e0faee4d7c2d/src/libraries/System.Globalization/tests/CompareInfo/CompareInfoTests.cs#L11
at lines 137, 138, 152, 155, 156, 157, 158, 161, 206, and 222.
Looking at the values, it appears that half-width katakana with dakuten do not compare equal to their full-width equivalents for some reason (half-width katakana without dakuten, however, do compare equal to their full-width equivalents). In addition, the sort order for some of them seems to have changed as well.
This is a regression on the Windows side. I have already communicated it to the Windows team and they are working on a fix. It would be good if we could update the tests to avoid running them on the failing build versions of Windows.
CC @ShawnSteele
I've been noticing failures of these two tests for a few months, going back to the pre-consolidation corefx days: https://github.com/dotnet/runtime/blob/ad97a127d91c53080265794c7a57e0faee4d7c2d/src/libraries/System.Data.Common/tests/System/Data/SqlTypes/SqlStringSortingTest.cs#L39-L54
Digging through, the failure specifically seems to be related to these particular test strings: https://github.com/dotnet/runtime/blob/ad97a127d91c53080265794c7a57e0faee4d7c2d/src/libraries/System.Data.Common/tests/System/Data/SqlTypes/SqlStringSortingTest.cs#L30 which are (ファズ・ギター, ファズ・ギター). They are expected to compare equal, but for some reason they do not, and so the tests fail. Given that the tests are basically just CompareInfo.Compare with a few options, I've boiled it down to this:
This returns 1 on 19536 / ko-KR, 0 on 18362 / de-DE (maybe installed with en-US?), and -1 on 19041 / en-GB.
It appears that either the locale setting or the build version affects the comparison result. I suspect it's caused by dakuten (the two dots on the upper right side of a character): if I compare two identical Japanese kana characters with dakuten that differ only in width, e.g. ガ ('\u30AC') and ガ ("\uFF76\uFF9E"), they compare as different for some reason. I expect those to compare equal when the CompareOptions.IgnoreWidth option is passed, as that makes logical sense.
Also, a similar issue, albeit on Linux: U+FF70 Halfwidth Katakana-Hiragana Prolonged Sound Mark and U+30FC Katakana-Hiragana Prolonged Sound Mark have weird quirks when compared in various combinations. I have not run the tests on a Linux machine, so I'm not sure whether the tests actually fail, but they might. (Using the same code, CompareInfo.GetCompareInfo("ja-JP").Compare(left, right, CompareOptions.IgnoreCase | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth).)
タ
タ
ー
ー
ター
ター
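For anyone who wants to check these comparisons on their own machine, here is a minimal sketch rather than the original repro. It assumes the half-width/full-width pairings described above; the class name KanaWidthRepro and the Escape helper are made up for illustration. No expected output is given because the results reportedly vary with the Windows build, locale, and NLS/ICU version.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class KanaWidthRepro
{
    // Hypothetical helper: renders a string as \uXXXX escapes so that
    // half-width and full-width forms stay distinguishable in the console.
    public static string Escape(string s) =>
        string.Concat(s.Select(c => "\\u" + ((int)c).ToString("X4")));

    public static void Main()
    {
        CompareInfo compareInfo = CompareInfo.GetCompareInfo("ja-JP");
        const CompareOptions Options =
            CompareOptions.IgnoreCase |
            CompareOptions.IgnoreKanaType |
            CompareOptions.IgnoreWidth;

        // Assumed pairings: half-width form on the left, full-width on the right.
        var pairs = new (string Left, string Right)[]
        {
            ("\uFF76\uFF9E", "\u30AC"),       // half-width KA + dakuten vs full-width GA
            ("\uFF80", "\u30BF"),             // half-width TA vs full-width TA
            ("\uFF70", "\u30FC"),             // half-width vs full-width prolonged sound mark
            ("\uFF80\uFF70", "\u30BF\u30FC"), // TA + prolonged sound mark, both widths
        };

        foreach (var (left, right) in pairs)
        {
            // 0 means the strings compare equal under the given options.
            int result = compareInfo.Compare(left, right, Options);
            Console.WriteLine($"{Escape(left)} vs {Escape(right)} => {result}");
        }

        // Results depend on the platform, so record it with the output.
        Console.WriteLine(Environment.OSVersion);
    }
}
```

Printing escapes instead of the raw kana avoids confusion when a console font or a copy-paste step silently normalizes character width.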
I'm honestly not sure whether there's anything we can do at the framework level, because these seem to be issues with ICU/NLS, but I thought I'd post here first since it is causing local test failures and I have no idea where else to ask about this.
cc @tarekgh