aspnet / Localization

[Archived] Localization abstractions and implementations for ASP.NET Core applications. Project moved to https://github.com/aspnet/AspNetCore
Apache License 2.0

Pluralization #17

Closed hishamco closed 6 years ago

hishamco commented 9 years ago

Pluralization is a complex problem, as different languages have a variety of complex rules for it. You can manage this in your language files using techniques such as:

Notification => You have [one message | many messages] in your inbox

Notification => You have one message in your inbox | You have many messages in your inbox

This can be done by introducing an optional isPlural parameter in the GetString method, which selects the singular or plural text based on the actual value.

hishamco commented 9 years ago

/cc @DamianEdwards

Eilon commented 9 years ago

There is no such general concept of pluralization across languages. For example, in US English, "1" is singular, and all other numbers are plural (0, fractions, etc.). But other languages have rules where "1, 2, and 3" each have distinct rules, and all other numbers are "plural."

I'm not sure there's any general purpose concept of "plural" that could be used here. You might as well just have N strings in your resources, and then have your own culture-specific logic to pick the right one.

hishamco commented 9 years ago

@Eilon I know that every language has its own rules for pluralization. Assume we have the following resource: Comment => Posted by {0} {1} [day | days] ago. Calling GetString("Notification","Elion",1, isPlural: false) should return the singular part of the resource, giving Posted by Elion 1 day ago. Calling GetString("Notification","Elion",5, isPlural:true) should return the plural part, giving Posted by Elion 5 days ago.
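A minimal sketch of how the isPlural idea could work, assuming the bracket syntax above with exactly two alternatives. The BracketPlural class and this GetString signature are hypothetical and not part of the localization library; the resource text is passed in directly instead of being looked up by key.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class BracketPlural
{
    // Replaces each "[singular | plural]" group in the template with one
    // alternative, then fills the {0}, {1}, ... placeholders.
    public static string GetString(string template, bool isPlural, params object[] args)
    {
        string chosen = Regex.Replace(
            template,
            @"\[([^\]|]*)\|([^\]|]*)\]",
            m => (isPlural ? m.Groups[2].Value : m.Groups[1].Value).Trim());
        return string.Format(chosen, args);
    }
}
```

With the resource above, BracketPlural.GetString("Posted by {0} {1} [day | days] ago.", true, "Elion", 5) would take the second alternative. The caller still has to decide what counts as plural, which is the limitation raised in the next reply.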

Eilon commented 9 years ago

But what if a language has more than 2 cases? For example, in Hebrew many words have a special spelling when it's two of that item. So the cases are "1 day" (singular), "2 dayX" (special double plural), and "N days" (regular plural). It's not the case of plural=true/false.
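As a rough illustration of the earlier suggestion (N strings in the resources plus culture-specific logic to pick one), the sketch below maps a count to a resource-key suffix. PluralPicker, the "_One"/"_Two"/"_Other" key naming, and the fallback are assumptions made for this example, and the "he" rule only follows the simplified three cases described above, not a complete set of Hebrew rules.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class PluralPicker
{
    // One rule per language: maps a count to a resource-key suffix.
    private static readonly Dictionary<string, Func<int, string>> Rules =
        new Dictionary<string, Func<int, string>>
        {
            ["en"] = n => n == 1 ? "One" : "Other",
            ["he"] = n => n == 1 ? "One" : n == 2 ? "Two" : "Other",
        };

    // Builds the key of the resource entry to look up, e.g. "Days_Two".
    public static string KeyFor(string baseKey, string language, int count)
    {
        Func<int, string> rule;
        if (!Rules.TryGetValue(language, out rule))
        {
            // Assumption: unknown languages use a single generic form.
            rule = n => "Other";
        }
        return baseKey + "_" + rule(count);
    }
}
```

The resource file would then hold one entry per form (Days_One, Days_Two, Days_Other), and the application passes the key returned by KeyFor to its usual lookup.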

hishamco commented 9 years ago

@Eilon Hebrew is like my language, Arabic, which has three forms: singular, dual and plural. If I'm not wrong, we have 6 plural forms, which may be the largest number of plural forms among the world's languages, and perhaps the most complicated :) Let us have a look at the following:

1 day <=> يوم
2 days <=> يومين
3 days <=> 3 أيام
11 days <=> 11 يوما
100 days <=> 100 يوم

If you look at the Bing or Google translators, they treat the dual as plural, which may break the rules of the language :( Still, we can think differently by adding an expression to determine the proper translation, like the Yii Framework does, though perhaps that is complicated!!

This may need a proper design, but I come with a humble idea: add a Func<int,int> parameter that represents the rules of the language and returns the index of the part that should be taken from within the brackets. Let's try the following code for the English language:

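GetString("Notification", "Elion", 2, func(2));

// English rule: returns 1 (the plural part) for every n except 1, otherwise 0 (the singular part).
private int func(int n)
{
    if (n != 1)
        return 1;
    else
        return 0;
}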

The Func will return 1, which indicates the second part of [day | days], and this will produce Posted by Elion 2 days ago.

We can apply the same mechanism to the Arabic language:

‫Notification => تم النشر من قبل {0} منذ [ {0} يوم | يوم | يومين | {1} أيام | {1} يوما | {1} يوم ]

GetString("Notification", "Elion", 7, func(7));

// Arabic rule: returns the index of the form to take from the bracketed list.
private int func(int n)
{
    if (n == 0)
        return 0;
    else if (n == 1)
        return 1;
    else if (n == 2)
        return 2;
    else if (n % 100 >= 3 && n % 100 <= 10)
        return 3;
    else if (n % 100 >= 11)
        return 4;
    else
        return 5;
}

The Func will return 3, which indicates the fourth part of ‫تم النشر من قبل {0} منذ [ {0} يوم | يوم | يومين | {1} أيام | {1} يوما | {1} يوم ] and this will produce ‫تم النشر من قبل Elion منذ 7 أيام. This is of course the correct result in my language.
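The idea in this comment could be sketched roughly as follows, assuming the rule is passed as a real Func<int,int> (rather than its result) and the resource text is supplied directly. IndexedPlural, its GetString signature, and the out-of-range fallback are hypothetical choices made for this example; ArabicRule repeats the rule given above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class IndexedPlural
{
    // The Arabic rule from the comment above: returns the index of the form to use.
    public static int ArabicRule(int n)
    {
        if (n == 0) return 0;
        if (n == 1) return 1;
        if (n == 2) return 2;
        if (n % 100 >= 3 && n % 100 <= 10) return 3;
        if (n % 100 >= 11) return 4;
        return 5;
    }

    // Picks one alternative from each "[a | b | ...]" group using the index
    // returned by the rule, then fills the {0}, {1}, ... placeholders.
    public static string GetString(string template, Func<int, int> rule, int count, params object[] args)
    {
        int index = rule(count);
        string chosen = Regex.Replace(template, @"\[([^\]]*)\]", m =>
        {
            string[] forms = m.Groups[1].Value.Split('|');
            // Assumption: an out-of-range index is clamped to the nearest valid form.
            int i = Math.Min(Math.Max(index, 0), forms.Length - 1);
            return forms[i].Trim();
        });
        return string.Format(chosen, args);
    }
}
```

For the English resource, IndexedPlural.GetString("Posted by {0} {1} [day | days] ago.", n => n != 1 ? 1 : 0, 2, "Elion", 2) selects the second form. Passing IndexedPlural.ArabicRule with the Arabic resource would select among its six forms in the same way.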

glen-84 commented 9 years ago

@hishamco, @Eilon,

See my feedback here and here.

I created a proof of concept here, and the usage can be seen here. (Note: this is based on Damian's i18nStarterWeb, so it's a bit outdated)

The example shows pluralization using SmartFormat.NET as well as messageformat.net.

The only thing that I'm not sure about is this:

6) "Note that ValidationAttribute is in the .NET Framework and can't really be changed ..." – how does the ErrorMessage get localized? Are you suggesting that you won't be able to use a different type of string formatting, for example?

I also haven't really thought about whether text/message domains would be useful, or if they could be implemented without major API changes.

hishamco commented 9 years ago

Thanks @glen-84 for your feedback .. I appreciate it

hishamco commented 8 years ago

@glen-84 @Eilon I wrote an article about the Pluralization Syntax, and I hope it gets a proof of concept, as @DamianEdwards mentioned yesterday in the ASP.NET Community Stand-Up that pluralization may be supported in the future. Again, pluralization is not an easy task; it needs a good design.

brgrz commented 7 years ago

@Eilon @hishamco @glen-84 @DamianEdwards There really is no need to reinvent the wheel here. Other frameworks offer a much more comprehensive i18n story, for instance Angular 2, which bases its i18n efforts on the XLIFF and XMB standards. Those cover most of the advanced i18n scenarios. https://angular.io/guide/i18n

hishamco commented 7 years ago

@brgrz To be clear, AngularJS provides localization like other frameworks, but don't forget that AngularJS works on the client side. Furthermore, to apply the localization you need to use the AngularJS directives.

This may be useful for SPAs.

brgrz commented 7 years ago

@hishamco Yeah, and? How's that different from what we have at server side with Razor and tag helpers?

glen-84 commented 7 years ago

@brgrz I don't think that any wheels are being re-invented here. You could probably implement support for XLIFF and XMB just as easily as I implemented (PoC) support for ICU MessageFormat syntax.

hishamco commented 7 years ago

Frankly, I haven't tried AngularJS localization yet, but I think you are limited to JSON or JS files as resource files, because everything is done on the client side. On the other hand, ASP.NET Core is built with extensibility in mind, so you can use RESX files to store the resources, or you can create a custom one such as PO, JSON, EF, etc. The same goes for the RequestCultureProvider: you can use cookies, route data, the query string, or Accept-Language headers, or you can use your own, such as session.

IMHO there is a big difference between the two, and no wheels need to be re-invented.

brgrz commented 7 years ago

@hishamco Actually Angular 2 uses XLIFF files for i18n which are XML. But, you are talking about persistence and I was talking about standards on which to base the whole i18n story. Persistence/storage mechanisms are pretty much irrelevant or only relevant when everyone has to invent their own i18n design and support more complex scenarios themselves (this is the case now with MVC 5 and, as it seems, MVC 6 too because RESX really aren't too helpful when it comes to complex cases).

ICU message formats http://userguide.icu-project.org/formatparse/messages

brgrz commented 7 years ago

@glen-84 I could, but I was hoping MS would push things forward and solve i18n as it should be from the get-go. That's why I was pretty disappointed when I read this: https://docs.microsoft.com/en-us/aspnet/core/fundamentals/localization

Actually i18n used to be one of the top three issues on ASP.NET's UserVoice but it's gone now. I guess they cleaned up the issue list when they released Core but didn't actually solve the problem.

i18n is still the most neglected part of the framework.

hishamco commented 7 years ago

I agree with you regarding the complexity of RESX, so that's why I prefer a new localization source besides ResourceManagerStringLocalizer 😄

aspnet-hello commented 6 years ago

This issue was moved to aspnet/Home#2653