dotnet / docs

This repository contains .NET Documentation.
https://learn.microsoft.com/dotnet
Creative Commons Attribution 4.0 International
4.12k stars 5.8k forks source link

"(:ss|:s)" regex in the milliseconds example does not match FullDateTimePattern in all cultures #40356

Open KalleOlaviNiemitalo opened 3 weeks ago

KalleOlaviNiemitalo commented 3 weeks ago

Type of issue

Code doesn't work

Description

In "How to: Display milliseconds in date and time values", the example inserts a milliseconds format specifier into the format string by searching for the "(:ss|:s)" regular expression in DateTimeFormatInfo.FullDateTimePattern:

https://github.com/dotnet/docs/blob/0dd37de7264abac2c869eceb977fdba77edbaa23/docs/standard/base-types/snippets/how-to-display-milliseconds-in-date-and-time-values/csharp/Program.cs#L24-L30

In some cultures however, DateTimeFormatInfo.FullDateTimePattern does not contain any colons and the regex thus won't match. The sample code won't display any milliseconds in those cultures. With .NET 8.0.1 on Windows, those cultures are:

Name EnglishName FullDateTimePattern LongTimePattern
as-IN Assamese (India) dddd, d MMMM, yyyy tt h.mm.ss tt h.mm.ss
da-DK Danish (Denmark) dddd 'den' d. MMMM yyyy HH.mm.ss HH.mm.ss
da-GL Danish (Greenland) dddd 'den' d. MMMM yyyy HH.mm.ss HH.mm.ss
en-DK English (Denmark) dddd, d MMMM yyyy HH.mm.ss HH.mm.ss
en-FI English (Finland) dddd, d MMMM yyyy H.mm.ss H.mm.ss
fi-FI Finnish (Finland) dddd d. MMMM yyyy H.mm.ss H.mm.ss
fr-CA French (Canada) dddd d MMMM yyyy HH 'h' mm 'min' ss 's' HH 'h' mm 'min' ss 's'
id-ID Indonesian (Indonesia) dddd, dd MMMM yyyy HH.mm.ss HH.mm.ss
kl-GL Kalaallisut (Greenland) yyyy MMMM d, dddd HH.mm.ss HH.mm.ss
si-LK Sinhala (Sri Lanka) yyyy MMMM d, dddd HH.mm.ss HH.mm.ss
smn-FI Inari Sami (Finland) dddd, MMMM d. yyyy H.mm.ss H.mm.ss

This caused bug https://github.com/dotnet/aspire/issues/3127 in the .NET Aspire dashboard.

Please correct the example so that it detects the seconds format specifier in these cultures too.

Page URL

https://learn.microsoft.com/en-us/dotnet/standard/base-types/how-to-display-milliseconds-in-date-and-time-values

Content source URL

https://github.com/dotnet/docs/blob/main/docs/standard/base-types/how-to-display-milliseconds-in-date-and-time-values.md

Document Version Independent Id

80f9e353-0de8-c715-233e-033e6dd3c375

Article author

@adegeo

Metadata

KalleOlaviNiemitalo commented 3 weeks ago

Perhaps it can be done with a regular expression like

^(?:[^s"'\\]|"[^"]*"|'[^']*'|\\.)*ss?

where the parenthesised part matches any number of non-s format specifiers or quoted or backslash-escaped literals; and the ss? part then matches the seconds.

adamsitnik commented 1 week ago

cc @tarekgh as it seems to be globalization related issue

tarekgh commented 1 week ago

@adamsitnik globalization is working as expected. The issue here is the regex sample is not correct. Maybe using DateTimeFormatInfo.TimeSeparator instead of hardcoding : can help. @KalleOlaviNiemitalo way is worth to try too.

KalleOlaviNiemitalo commented 1 week ago

Maybe using DateTimeFormatInfo.TimeSeparator instead of hardcoding : can help.

I don't think it can work with the French (Canada) HH 'h' mm 'min' ss 's'.

tarekgh commented 1 week ago

You are right @KalleOlaviNiemitalo. I think the pattern (?<!\w)(s{1,2})(?!\w)(?=(?:[^'"]|'[^']*'|"[^"]*")*$) will help better? Anyway, we need to ensure whatever pattern we get is going to work with all time formats we have.

KalleOlaviNiemitalo commented 1 week ago

I don't see the point of the (?<!\w) and (?!\w) negative assertions; should there ever be a culture with just HHmmss, then those would prevent the regex from matching. If you want assertions, I think they should be (?<!s) and (?!s).

The reason I wrote the regex like I did was to make sure it recognises which single or double quotation marks are opening and which ones are closing, no matter how many pairs of those are before or after the seconds format specifier. The regex will be copied from the sample to users' applications and then perhaps never updated, so it would be good to proactively support format strings that are not currently used in any cultures.