Closed: runebrg closed this issue 2 years ago
Tagging subscribers to this area: @dotnet/area-system-globalization. See info in area-owners.md if you want to be subscribed.

| Author: | runebrg |
|---|---|
| Assignees: | - |
| Labels: | `area-System.Globalization` |
| Milestone: | - |
FWIW I don't know the specifics of Danish/Norwegian culture, but in Polish we also have special phonetic characters (e.g. `sz`), and I'd be surprised if this:
string test = "Pszczoła";
Console.WriteLine(test.StartsWith(test.Substring(0, 2), false, CultureInfo.CreateSpecificCulture("pl-PL"))); // true
ever returned false (this works as I'd expect for Polish) so it makes sense that this works consistently across other cultures as well.
@runebrg this behavior is defined by the Unicode standard. `aa` is considered equivalent to `Å`. You may look at the history for more info. See also the similar issue https://github.com/dotnet/runtime/issues/72770.
If you disagree with this behavior, you may log a ticket with ICU at unicode-org.atlassian.net/jira/software/c/projects/ICU/issues.
I agree that `aa` should be considered equivalent to `å` in many cases in Norwegian (though not always; this is context dependent). But I still think the .NET framework behaves inconsistently here. Both `"aa".StartsWith("a")` and `"aa".StartsWith("å")` are false, but `"aa".StartsWith('a')` and `"aa".Contains("a")` are true.
Fiddle: https://dotnetfiddle.net/RcbfSa
> But I still think the .NET framework behaves inconsistently here.

It does, but this behavior is documented and I believe it can't be changed, because that would break backwards compatibility.
The following operations are performed as ordinal operations, not linguistic ones. You can get the same result from `StartsWith` with a string argument by doing something like `"aa".StartsWith("a", StringComparison.Ordinal)`. This should return true.
Console.WriteLine("aa".Contains("a")); //True
Console.WriteLine("aa".StartsWith('a')); //True
Also, consistency with .NET Framework can be achieved if you enable the NLS mode. We don't recommend that though.
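The ordinal versus linguistic distinction discussed in this thread can be sketched as a small console program. This is only an illustration under stated assumptions: the class name `OrdinalVersusLinguistic` and the helper `IsOrdinalPrefix` are hypothetical, and the culture-sensitive result is printed without an expected value because it depends on the culture data and on whether the runtime uses ICU or NLS.

```csharp
using System;
using System.Globalization;

public static class OrdinalVersusLinguistic
{
    // Hypothetical helper: compares raw UTF-16 code units, ignoring culture rules.
    public static bool IsOrdinalPrefix(string text, string prefix) =>
        text.StartsWith(prefix, StringComparison.Ordinal);

    public static void Main()
    {
        // Culture taken from the issue report.
        CultureInfo norwegian = CultureInfo.CreateSpecificCulture("nb-NO");

        // These overloads are documented as ordinal comparisons.
        Console.WriteLine("aa".StartsWith('a'));        // True
        Console.WriteLine("aa".Contains("a"));          // True

        // The string overload becomes ordinal when asked explicitly.
        Console.WriteLine(IsOrdinalPrefix("aa", "a"));  // True

        // Linguistic comparison: the outcome depends on the culture's
        // collation rules and on the ICU/NLS setting, so no value is claimed.
        Console.WriteLine("aa".StartsWith("a", false, norwegian));
    }
}
```

Passing a `StringComparison` or `CultureInfo` explicitly at each call site makes it visible which kind of comparison is intended, which avoids relying on the differing defaults of the overloads.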
Description
String.StartsWith() will sometimes return the wrong value if the string contains "AA" and the culture is set to Norwegian (nb-NO) or Danish (da-DK).
Even though the double A has a special meaning in Norwegian, I would expect `s.StartsWith(s.Substring(0, 2))` to always return true.
Example .NET fiddle: https://dotnetfiddle.net/h4u01x
Reproduction Steps
var s = "BAAC";
var b = s.StartsWith(s.Substring(0, 2), false, CultureInfo.CreateSpecificCulture("nb-NO"));
Expected behavior
b should be true
Actual behavior
b is false
Regression?
Using .NET Framework 4.7.2, it works as expected for Norwegian, returning true. But for Danish it is still false.
Example: https://dotnetfiddle.net/FCwqGH
Known Workarounds
Specifying InvariantCulture fixes the problem.
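A minimal sketch of the workaround applied to the reproduction string, alongside the ordinal alternative suggested in the comments. The class name `StartsWithWorkarounds` and its two helper methods are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

public static class StartsWithWorkarounds
{
    // Workaround from the issue: linguistic comparison under the invariant culture,
    // so Norwegian/Danish tailoring is not applied.
    public static bool StartsWithInvariant(string text, string prefix) =>
        text.StartsWith(prefix, false, CultureInfo.InvariantCulture);

    // Alternative from the comments: ordinal comparison of UTF-16 code units.
    public static bool StartsWithOrdinal(string text, string prefix) =>
        text.StartsWith(prefix, StringComparison.Ordinal);

    public static void Main()
    {
        string s = "BAAC";
        string prefix = s.Substring(0, 2); // "BA"

        Console.WriteLine($"InvariantCulture: {StartsWithInvariant(s, prefix)}");
        Console.WriteLine($"Ordinal: {StartsWithOrdinal(s, prefix)}"); // Ordinal: True
    }
}
```

The ordinal variant is usually the safer choice when the prefix check is about the stored characters rather than how a reader of a given language would perceive them.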