jeffijoe / messageformat.net

ICU MessageFormat implementation for .NET.
MIT License
161 stars 23 forks source link

Support InvariantGlobalization / default to InvariantCulture #41

Open glen-84 opened 9 months ago

glen-84 commented 9 months ago

When enabling InvariantGlobalization (which I believe is now the default in API/gRPC/Worker project templates), the library fails with:

System.Globalization.CultureNotFoundException : Only the invariant culture is supported in globalization-invariant mode. See https://aka.ms/GlobalizationInvariantMode for more information. (Parameter 'name')

en is an invalid culture identifier.

   at System.Globalization.CultureInfo..ctor(String name, Boolean useUserOverride)
   at Jeffijoe.MessageFormat.Formatting.Formatters.VariableFormatter.GetCultureInfo(String locale)
   at Jeffijoe.MessageFormat.Formatting.Formatters.VariableFormatter.Format(String locale, FormatterRequest request, IReadOnlyDictionary`2 args, Object value, IMessageFormatter messageFormatter)
   at Jeffijoe.MessageFormat.MessageFormatter.FormatMessage(String pattern, IReadOnlyDictionary`2 args)

Would it make sense to default to the invariant culture (empty string "") instead of "en"?

This affects static usage – with a custom instance you could simply specify the locale/culture yourself.

It's a breaking change, but worth it in my opinion.

An alternative would be to allow the locale on the default instance to be set.

jeffijoe commented 9 months ago

Interesting, I had no idea. Is that a new thing they are doing?

I agree, changing the default to invariant makes sense. We should release that as a breaking change.

glen-84 commented 9 months ago

Is that a new thing they are doing?

Yes, fairly new. See https://github.com/dotnet/aspnetcore/issues/47029.

jeffijoe commented 9 months ago

Do you think setting the default to invariant is better, or should we use whatever the current culture is?

glen-84 commented 9 months ago

Good question. I think that using the current culture is a better idea. 👍

jeffijoe commented 9 months ago

Sounds good to me!

NightOwl888 commented 9 months ago

Yes, by default, the current culture is used unless you specify one with the CultureInfo overload.

However, MessageFormat and other Java-style formatters are built upside down so the culture is specified at object creation instead of being passed into the Format method, which would be more practical in a multi threaded app and conform with the rest of the dotnet culture behaviors.

You could "fix" this by adding a constructor overload that doesn't accept culture, but it seems broken when the culture of the current thread changes to something else and this instance still uses the culture from the time it was created.

jeffijoe commented 9 months ago

How often does the current culture change though?

NightOwl888 commented 9 months ago

In a web app, the culture is tied to the thread of the current request. So, every request can be in a different culture. This means that the instance of MessageFormat that you create is only valid for the thread it was created on. That is, it isn't threadsafe. This is fine as long as you either create and garbage collect a MessageFormat instance on each request or put it in a cache that uses the culture as a key so it can be reused by multiple threads that are using that culture. But creating an instance on each request is expensive and adding caching to share MessageFormat instances across threads is not very practical.

Of course, the app also can change the culture of the current thread as needed which can also make the MessageFormat instance go out of sync with the current thread. I am not sure how often that would change outside of a shared web application, which sets it at the beginning of the request and it typically stays set to the same culture until the end of the request. But since only the current thread can change the culture of itself, it will never happen in the middle of an operation. It is possible it could happen multiple times, though.

The way Microsoft gets around this issue for its formatters are to make them static methods. CultureInfo actually contains all of the culture data, so that data lives on the thread and the business logic of the formatter static method can then be called by multiple threads and pass in different culture data. IMO, it would be best to do that here, also. Then creating the MessageFormat instance would just be an optional thing to make it look and feel like doing message formatting in other programming languages. And you could have a CultureInfo overload of Format() in that object so it would behave in a threadsafe way, without tying the instance of the formatter to any specific culture. If you call the overload with no CultureInfo, it would use the CultureInfo from the current thread. So, if the next operation the current thread does is to change the culture to something else, you wouldn't need to create a new MessageFormat instance for it to respond to the change. Instead, the MesssageFormat instance would call into the static method with the new CultureInfo instance and the user would see the expected culture-formatted result.

jeffijoe commented 9 months ago

MessageFormat instances are still useful for caching parsed messages.

Tbh I'm fine changing from passing the culture to the constructor to passing it via a parameter, it'll just be a breaking change. Unless we make it optional and keep the constructor one as a fallback.