Open stefanrbk opened 4 years ago
Thanks @stefanrbk this is an interesting find. Are you able to confirm if you're using .NET 5?
From my testing on Windows I can only exactly reproduce the error message Strings differ at index 5.
on .NET 5.
For all other runtimes (NET Core 2.1, 3.1 and all of .NET Framework 3.5+) I see the error message Strings differ at index 0.
. I don't see the \u200E
mark inserted around the negative sign when calling ToString()
(hence the difference).
I think this is a general issue on all runtimes it just may be more prevalent on NET 5+ given the globalization changes The following code also wrongly displays the first difference (on all platforms):
[Test]
public void ToStringTest()
{
var actual = "\u200E\u200E\u200E\u200E\u200E\u200E\u200E\u200E1768";
var expected = "\u200E\u200E\u200E\u200E\u200E\u200E\u200E\u200E\u200E768";
Assert.That(actual, Is.EqualTo(expected));
}
String lengths are both 12. Strings differ at index 8. Expected: "768" But was: "1768" -------------------^
One thing to note is that it is doing what it is supposed to with the string position. Maybe the arrow below needs to count Glyphs instead of string chars. That or call out the hex values where the strings differ when the difference is a non printable character.
I am doing some tests on a class with a
ToString(string? format, IFormatProvider? provider)
method to make sure it returns the correct format in different languages/cultures. One such number I'm testing with is-32768
to test for the thousands separator and the negative sign formatting. When I test usingur-IN
for the Urdu language in India with an error with the thousands separator, the test summary message doesn't make sense on first glance.The message generated from the string comparison by
Assert.That
is the following:This test fails because char[5] (the thousands separator) is supposed to be '\u066C' instead of '\u066B'. What I get in the test message is what confused me when trying to fix the bug because I didn't look at the
Strings differ at index 5.
. I looked at the arrow pointing to the6
.ur-IN
uses a "Left-To-Right Mark" '\u200E' before and after the negative sign. The message is just counting thechar
s and not the actual visual location of the diff in the strings.I have been working with Unicode for the last month or so and things keep throwing me off, making me chase down bugs in the wrong places.
A suggestion for after the bug is investigated, can the string compare message specify the UTF value of the string difference?