Humanizr / Humanizer

Humanizer meets all your .NET needs for manipulating and displaying strings, enums, dates, times, timespans, numbers and quantities

RomanNumeralExtensions.FromRoman incorrectly handles Latin Small Letter Dotless I #1507

Closed stephentoub closed 4 months ago

stephentoub commented 4 months ago

The regex allows ı (U+0131) through (because it uses the current culture with ignore case), but the dictionary doesn't contain an entry that maps to it (because it uses ordinal ignore case), and thus the dictionary lookup throws:

using Humanizer;
using System.Globalization;

CultureInfo.CurrentCulture = new CultureInfo("tr-TR");
Console.WriteLine(RomanNumeralExtensions.FromRoman("\u0131"));

results in:

Unhandled exception. System.Collections.Generic.KeyNotFoundException: The given key 'i' was not present in the dictionary.
   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
   at Humanizer.RomanNumeralExtensions.FromRoman(ReadOnlySpan`1 input) in /_/src/Humanizer/RomanNumeralExtensions.cs:line 95
   at Humanizer.RomanNumeralExtensions.FromRoman(String input) in /_/src/Humanizer/RomanNumeralExtensions.cs:line 71
   at Program.<Main>$(String[] args)

My guess is the intent was for the regex to be using invariant culture, such that only i and I are permitted, and not ı or İ.
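
To make the mismatch concrete, here is a minimal sketch that is not Humanizer's actual code: `Pattern` is a simplified, hypothetical stand-in for the real Roman numeral regex, and `values` is a one-entry stand-in for the real lookup table. It assumes a runtime where the tr-TR culture is available, and compares a regex built with `RegexOptions.IgnoreCase` alone against one that also passes `RegexOptions.CultureInvariant`.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

CultureInfo.CurrentCulture = new CultureInfo("tr-TR");

// Hypothetical simplified pattern, NOT the regex Humanizer actually uses.
const string Pattern = "^[MDCLXVI]+$";

// IgnoreCase alone: case mapping follows the culture that is current
// when the Regex is constructed (tr-TR here).
var cultureSensitive = new Regex(Pattern, RegexOptions.IgnoreCase);

// IgnoreCase + CultureInvariant: case mapping ignores the current culture.
var cultureInvariant = new Regex(
    Pattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

// Hypothetical one-entry stand-in for the lookup table, keyed with
// ordinal ignore-case as the issue describes.
var values = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
{
    ["I"] = 1,
};

const string DotlessI = "\u0131"; // Latin Small Letter Dotless I

bool sensitiveMatch = cultureSensitive.IsMatch(DotlessI);
bool invariantMatch = cultureInvariant.IsMatch(DotlessI);
bool dictionaryHasDotlessI = values.ContainsKey(DotlessI);
bool dictionaryHasLowerI = values.ContainsKey("i");

Console.WriteLine($"culture-sensitive regex matches ı: {sensitiveMatch}");
Console.WriteLine($"culture-invariant regex matches ı: {invariantMatch}");
Console.WriteLine($"dictionary contains ı: {dictionaryHasDotlessI}");
Console.WriteLine($"dictionary contains i: {dictionaryHasLowerI}");
```

If the behaviour matches the report above, the culture-sensitive regex should accept ı while the dictionary has no key for it, which is the gap that produces the KeyNotFoundException. With `CultureInvariant` the regex and the dictionary should agree on which characters count as `I`, so the input would be rejected at validation instead of failing in the lookup.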