(Bug) Assertion Doesn't Handle Non-printible Characters Correctly in Message

stefanrbk commented 3 years ago

I am doing some tests on a class with a ToString(string? format, IFormatProvider? provider) method to make sure it returns the correct format in different languages/cultures. One such number I'm testing with is -32768 to test for the thousands separator and the negative sign formatting. When I test using ur-IN for the Urdu language in India with an error with the thousands separator, the test summary message doesn't make sense on first glance.

// Simplified from actual code
[Test]
public void ToStringTest() =>
    // Effectively my method I'm testing calls ToString on a double like this.
    Assert.That((-32768.0).ToString("#,0.################", CultureInfo.CreateSpecificCulture("ur-IN")), Is.EqualTo("\u200E-\u200E32\u066B768"));

The message generated from the string comparison by Assert.That is the following:

String lengths are both 9. Strings differ at index 5.
      Expected: "‎-‎32٫768"
      But was:  "‎-‎32٬768"
      ----------------^

This test fails because char[5] (the thousands separator) is supposed to be '\u066C' instead of '\u066B'. What I get in the test message is what confused me when trying to fix the bug because I didn't look at the Strings differ at index 5.. I looked at the arrow pointing to the 6. ur-IN uses a "Left-To-Right Mark" '\u200E' before and after the negative sign. The message is just counting the chars and not the actual visual location of the diff in the strings.

I have been working with Unicode for the last month or so and things keep throwing me off, making me chase down bugs in the wrong places.

A suggestion for after the bug is investigated, can the string compare message specify the UTF value of the string difference?

stevenaw commented 3 years ago

Thanks @stefanrbk this is an interesting find. Are you able to confirm if you're using .NET 5?

From my testing on Windows I can only exactly reproduce the error message Strings differ at index 5. on .NET 5. For all other runtimes (NET Core 2.1, 3.1 and all of .NET Framework 3.5+) I see the error message Strings differ at index 0.. I don't see the \u200E mark inserted around the negative sign when calling ToString() (hence the difference).

stevenaw commented 3 years ago

I think this is a general issue on all runtimes it just may be more prevalent on NET 5+ given the globalization changes The following code also wrongly displays the first difference (on all platforms):

[Test]
public void ToStringTest()
{
    var actual = "\u200E\u200E\u200E\u200E\u200E\u200E\u200E\u200E1768";
    var expected = "\u200E\u200E\u200E\u200E\u200E\u200E\u200E\u200E\u200E768";

    Assert.That(actual, Is.EqualTo(expected));
}

String lengths are both 12. Strings differ at index 8. Expected: "‎‎‎‎‎‎‎‎‎768" But was: "‎‎‎‎‎‎‎‎1768" -------------------^

stefanrbk commented 3 years ago

One thing to note is that it is doing what it is supposed to with the string position. Maybe the arrow below needs to count Glyphs instead of string chars. That or call out the hex values where the strings differ when the difference is a non printable character.

nunit / nunit

(Bug) Assertion Doesn't Handle Non-printible Characters Correctly in Message #3678