Closed SeanKilleen closed 1 year ago
Tagging subscribers to this area: @dotnet/area-system-globalization See info in area-owners.md if you want to be subscribed.
| Author: | SeanKilleen |
|---|---|
| Assignees: | - |
| Labels: | `area-System.Globalization` |
| Milestone: | - |
Setting aside the culture data question, your actual problem is this:
> I ran into an issue where a unit test around the time formatting failed
You shouldn't be writing tests with dependencies on data that
Generally testing output falls into one of two camps:
`ShortTimePattern`, not that it formats it as `4AM` or whatever. This can either be done with `CultureInfo.InvariantCulture`, or by constructing a testing-specific custom culture (Microsoft actually uses this to ensure that UI elements even have localization set up).

@Clockwork-Muse typically I'd agree, but in this case, this very specific formatting is required for an output for users regardless of what machine they or we are operating on. It goes into an email for users. I happened to be writing this code and these tests specifically because another developer had gotten the formatting wrong, so I assure you it's a valid test case.
My client is an international organization that in some cases needs things to default to a specific culture. One of the reasons we have this test is to ensure the output will be as we expect on whatever environment we run the tests on. It's not foolproof, but it gives us a much higher degree of confidence.
Please note that my tests accomplished exactly what they needed to, with the exception of discovering the non-breaking space. All the test cases in this issue are simplified for reproduction purposes.
So I'd like not to set aside the culture data question, but rather focus on it. 👍
Seems the possible outcomes here will be one of the following:
Regardless, I hope at the end of this I'll understand:
Getting NBSP inside the AM/PM designators is what CLDR decided to do. You may look at https://unicode-org.atlassian.net/jira/software/c/projects/CLDR/issues/CLDR-11469?jql=project%20%3D%20%22CLDR%22%20AND%20text%20~%20%22NBSP%22%20ORDER%20BY%20created%20DESC for that.
.NET does not map directly to CLDR; the mapping is done at the level of the ICU native library APIs. In .NET we call the correct API to get the needed data, as shown here: `UDAT_AM_PMS`.
In summary, .NET is working as expected and the data returned is the expected data according to CLDR. I am closing the issue, but feel free to send any questions you think we can help with. Thanks for the report.
Thanks for clarifying, @tarekgh, and pointing me to the supporting information! Much appreciated. 👍
> this very specific formatting is required for an output for users regardless of what machine they or we are operating on. ... One of the reasons we have this test is to ensure the output will be as we expect on whatever environment we run the tests on.
If it's that specific, you should be constructing a permanent, static, culture with the specific formatting requirements, not pulling from environment data.
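A minimal sketch of what such a fixed culture could look like, assuming the required designators are `a. m.` and `p. m.` with a plain ASCII space. The class name `EmailCulture`, the method `FormatTime`, and the `h:mm tt` pattern are hypothetical and not taken from this thread; the sketch also starts from the invariant culture, so it carries none of the other `es-ES` conventions.

```csharp
using System;
using System.Globalization;

// Hypothetical holder for a culture whose AM/PM designators are pinned in code
// instead of being read from the ICU/CLDR data installed on the machine.
public static class EmailCulture
{
    public static readonly CultureInfo Instance = Build();

    private static CultureInfo Build()
    {
        // Clone() returns a writable copy of the read-only invariant culture.
        var culture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
        culture.DateTimeFormat.AMDesignator = "a. m."; // ASCII space (U+0020), assumed requirement
        culture.DateTimeFormat.PMDesignator = "p. m.";
        // Freeze the copy so later code cannot change the designators.
        return CultureInfo.ReadOnly(culture);
    }

    // "h:mm tt" is an illustrative pattern, not the one used in the real email.
    public static string FormatTime(DateTime value) =>
        value.ToString("h:mm tt", Instance);

    public static void Main()
    {
        Console.WriteLine(FormatTime(new DateTime(2022, 1, 1, 4, 5, 0)));  // 4:05 a. m.
        Console.WriteLine(FormatTime(new DateTime(2022, 1, 1, 16, 5, 0))); // 4:05 p. m.
    }
}
```

A test against `EmailCulture.FormatTime` then asserts on values the code itself controls, so an ICU or CLDR update cannot change the expected string.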
> My client is an international organization that in some cases needs things to default to a specific culture.
This is a separate concern, unrelated to the specific formatting observed. Some languages/regions control formatting more tightly, and if you want your application to pick up new data as it is released, then you should only be testing that the expected default culture is passed (e.g., comparing the culture id), not what formatting output it produces.
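As a rough illustration of comparing the culture id rather than the formatted output, a check along these lines could be used. The class `DefaultCultureCheck` and the method `UsesCulture` are hypothetical names introduced for this sketch.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: verifies which culture was selected without
// depending on the formatting data that culture currently carries.
public static class DefaultCultureCheck
{
    public static bool UsesCulture(CultureInfo actual, string expectedName) =>
        string.Equals(actual.Name, expectedName, StringComparison.OrdinalIgnoreCase);

    public static void Main()
    {
        var culture = new CultureInfo("es-ES");
        Console.WriteLine(UsesCulture(culture, "es-ES"));
    }
}
```

Because only `CultureInfo.Name` is compared, the check keeps passing when the underlying CLDR data for that culture changes.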
Description
As part of a client project using .NET 6, I am displaying a date/time in the `es-ES` locale.

I ran into an issue where a unit test around the time formatting failed, but the two strings looked exactly alike. So I started digging and I think it's possible there may be a small issue (or it may be by design).
Reproduction Steps
In a .NET 6 project with xUnit and FluentAssertions:
See the formatting issue:
So then I figured I'd check the character:
I know that as of .NET 5 (IIRC), globalization data was moved to ICU.
So I looked at the ICU Locale file: https://github.com/unicode-org/icu/blob/main/icu4c/source/data/locales/es.txt
And I copied / pasted those values into `InlineData` in another test:

Some of those tests fail, reporting that a character code of 8239 (U+202F, the Unicode narrow no-break space) is being used.
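A minimal sketch, not the original test code, of one way to see which characters the designators actually contain. The class `DesignatorInspector` and the method `DescribeChars` are hypothetical names; the printed values depend on the ICU/CLDR data available on the machine, so no expected output is shown for `Main`.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class DesignatorInspector
{
    // Renders each UTF-16 code unit as "U+XXXX" so invisible differences
    // (for example U+0020 versus U+202F) become visible in test output.
    public static string DescribeChars(string value) =>
        string.Join(" ", value.Select(c => $"U+{(int)c:X4}"));

    public static void Main()
    {
        var format = new CultureInfo("es-ES").DateTimeFormat;
        Console.WriteLine($"AM: {DescribeChars(format.AMDesignator)}");
        Console.WriteLine($"PM: {DescribeChars(format.PMDesignator)}");
    }
}
```

Dumping both the expected and the actual string this way shows directly whether the space is U+0020 or U+202F (decimal 8239).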
I think this maps to the AMDesignator and PMDesignator fields -- these tests will fail:
Theory
Given the test results and the likelihood of mapping, my current theory is that:
- `AMDesignator` and `PMDesignator` fields.
- `AMDesignator`/`PMDesignator` aspects of `CultureInfo`
- `tt` for that culture.
Expected behavior
`DateTime.ToString("tt")` for a `CultureInfo` of `es-ES` will map to `a. m.` or `p. m.` (with an ASCII space).
Actual behavior
`DateTime.ToString("tt")` for a `CultureInfo` of `es-ES` outputs `a. m.` or `p. m.` using a Unicode narrow no-break space (U+202F, character code 8239), not an ASCII space, for the space after the first `.`.
Regression?
I'm not sure.
Known Workarounds
Configuration
Other information
No response