sillsdev / l10nsharp

A .NET localization library for Windows Forms applications.
3 stars 10 forks source link

Handle languages the system doesn't know about (BL-5260) #28

Closed StephenMcConnel closed 6 years ago

StephenMcConnel commented 6 years ago

We can't depend on CultureInfo working for every language that our programs may be localized into. Mono 4 throws exceptions while .Net 4.6.1 creates generally worthless objects for languages that the authors didn't know about or didn't consider worth implementing.


This change is Reviewable

JohnThomson commented 6 years ago

Looks mostly good, with a few minor points. There are classes in LibPalaso which know at least English names for all the Ethnologue languages. It's probably not desirable for L10NSharp to depend on LibPalaso, especially since SIL.Windows.Forms.WritingSystems already depends on L10NSharp; but I wonder whether there's any benefit in allowing a client to override the algorithm for naming unknown cultures? Tricky because ideally we'd want L10NSharp to define an interface and libpalaso to implement it, or slightly more simply for libpalaso to both define and implement it. Probably best left YAGNI.


Review status: 0 of 5 files reviewed at latest revision, 9 unresolved discussions.


src/L10NSharp/L10NCultureInfo.cs, line 14 at r1 (raw file):

  /// class tries to create a CultureInfo object, and uses its values (for the most
  /// part) if successful.  If it fails, well, there are a few languages we know
  /// about that have been used for localization that we can handle in addition.

Suggest adding something like, "L10NCultureInfo objects for languages that are not known to Windows or Mono (typically those without 2-letter codes) and for which this class does not supply information will not produce very useful L10NCultureInfo objects. Therefore, this class should only be used for situations where we know only a limited set of languages will occur, such as a list of Bloom's UI languages. It should not be used where the name passed in will be an arbitrary vernacular language code"


src/L10NSharp/L10NCultureInfo.cs, line 28 at r1 (raw file):

              RawCultureInfo = null;
          }
          if (RawCultureInfo == null || RawCultureInfo.EnglishName == "Unknown Language")

== or StartsWith? Seems like I've seen reference to things like Unknown Language (tpi). Did you ever find out for sure whether "Unknown Language" will get localized in non-English versions of Windows?


src/L10NSharp/L10NCultureInfo.cs, line 35 at r1 (raw file):

              {
                  IetfLanguageTag = name;
                  TwoLetterISOLanguageName = name;

Is this right even if it is NOT a two-letter code? I suppose it's better than any available alternative.


src/L10NSharp/L10NCultureInfo.cs, line 43 at r1 (raw file):

                  IetfLanguageTag = name.Substring(0, idx);
                  TwoLetterISOLanguageName = IetfLanguageTag;
                  ThreeLetterISOLanguageName = IetfLanguageTag;

Rather trivial, but those last two lines would work for the other case, too.


src/L10NSharp/L10NCultureInfo.cs, line 71 at r1 (raw file):

                  NumberFormat = CultureInfo.InvariantCulture.NumberFormat;
                  break;
              }

Should we initialize NumberFormat to something in the case where we don't know anything?


src/L10NSharp/L10NCultureInfo.cs, line 87 at r1 (raw file):

                  NativeName = RawCultureInfo.NativeName;
              EnglishName = RawCultureInfo.EnglishName;
              DisplayName = RawCultureInfo.DisplayName;

Are there cases where any of these are null/empty and should be defaulted to one of the others?


src/L10NSharp/L10NCultureInfo.cs, line 147 at r1 (raw file):

              }
              if (!havePbu)
                  list.Add(GetCultureInfo("pbu"));

Seems unnecessarily complex. Why not put this line in the if above and drop the variable?


src/L10NSharp/L10NCultureInfo.cs, line 165 at r1 (raw file):

      }

      public override bool Equals (object obj)

.net coding standard says that if you override Equals, you MUST override GetHashCode(); it must return the same value for objects that Equals says are equal. Do we need to do something so that == and != will behave correctly?


src/L10NSharp/LocalizationManager.cs, line 330 at r1 (raw file):

          var ci = L10NCultureInfo.GetCultureInfo(langId);
          if (ci.RawCultureInfo != null)
              Thread.CurrentThread.CurrentUICulture = ci.RawCultureInfo;

Is there ANYTHING useful we can do about the thread culture in the other case? Seems it would be better to do something arbitrary but at least predictable...English or the Invariant culture...rather than leaving it set to whatever it was before.


Comments from Reviewable

StephenMcConnel commented 6 years ago

Review status: 0 of 5 files reviewed at latest revision, 9 unresolved discussions.


src/L10NSharp/L10NCultureInfo.cs, line 14 at r1 (raw file):

Previously, JohnThomson (John Thomson) wrote…
Suggest adding something like, "L10NCultureInfo objects for languages that are not known to Windows or Mono (typically those without 2-letter codes) and for which this class does not supply information will not produce very useful L10NCultureInfo objects. Therefore, this class should only be used for situations where we know only a limited set of languages will occur, such as a list of Bloom's UI languages. It should not be used where the name passed in will be an arbitrary vernacular language code"

Done. I edited your paragraph a bit and put it in a remarks section.


src/L10NSharp/L10NCultureInfo.cs, line 28 at r1 (raw file):

Previously, JohnThomson (John Thomson) wrote…
== or StartsWith? Seems like I've seen reference to things like Unknown Language (tpi). Did you ever find out for sure whether "Unknown Language" will get localized in non-English versions of Windows?

Done.


src/L10NSharp/L10NCultureInfo.cs, line 35 at r1 (raw file):

Previously, JohnThomson (John Thomson) wrote…
Is this right even if it is NOT a two-letter code? I suppose it's better than any available alternative.

I added a commented explaining this.


src/L10NSharp/L10NCultureInfo.cs, line 43 at r1 (raw file):

Previously, JohnThomson (John Thomson) wrote…
Rather trivial, but those last two lines would work for the other case, too.

Done.


src/L10NSharp/L10NCultureInfo.cs, line 71 at r1 (raw file):

Previously, JohnThomson (John Thomson) wrote…
Should we initialize NumberFormat to something in the case where we don't know anything?

Just in case someone blindly calls this with a truly unknown language and then expects number formatting to work. I'd rather a user/tester complain that a new localization displayed as "Unknown Language" than complain that it did that and also crashed creating PDFs or whatever it is that uses number formatting. THe InvariantCulture.NumberFormat seemed like a safe choice to prevent crashing at least.


src/L10NSharp/L10NCultureInfo.cs, line 87 at r1 (raw file):

Previously, JohnThomson (John Thomson) wrote…
Are there cases where any of these are null/empty and should be defaulted to one of the others?

I don't know of any such cases. In Mono, CultureInfo.DisplayName always explicitly returns EnglishName. I'm not sure how .Net works, but I would be surprised if it (or Mono) returned null for any culture that it claims to know about for any of these names.


src/L10NSharp/L10NCultureInfo.cs, line 147 at r1 (raw file):

Previously, JohnThomson (John Thomson) wrote…
Seems unnecessarily complex. Why not put this line in the if above and drop the variable?

I think you're missing the ! (not) operator. I have to scan the entire list to verify that the item does not exist before adding it. I suppose I could create each of the objects and add them only if they were not already in the list. But this approach avoids adding a (tiny) amount of wasted effort for when the system actually knows one or more of the languages we're adding if needed. (I think Windows .Net does know about Dari. I don't know if Mono is aware of any of the three languages we're adding.)


src/L10NSharp/L10NCultureInfo.cs, line 165 at r1 (raw file):

Previously, JohnThomson (John Thomson) wrote…
.net coding standard says that if you override Equals, you MUST override GetHashCode(); it must return the same value for objects that Equals says are equal. Do we need to do something so that == and != will behave correctly?

Done.


src/L10NSharp/LocalizationManager.cs, line 330 at r1 (raw file):

Previously, JohnThomson (John Thomson) wrote…
Is there ANYTHING useful we can do about the thread culture in the other case? Seems it would be better to do something arbitrary but at least predictable...English or the Invariant culture...rather than leaving it set to whatever it was before.

I think this setting affects only system dialogs. Perhaps you're right. I'll try InvariantCulture and we'll see if anyone ever complains.


Comments from Reviewable

JohnThomson commented 6 years ago

Couple more questions...feel free to merge as-is if you don't think they're worth doing anything about.


Reviewed 3 of 5 files at r1, 2 of 2 files at r2. Review status: all files reviewed at latest revision, 4 unresolved discussions.


src/L10NSharp/L10NCultureInfo.cs, line 71 at r1 (raw file):

Previously, StephenMcConnel (Steve McConnel) wrote…
Just in case someone blindly calls this with a truly unknown language and then expects number formatting to work. I'd rather a user/tester complain that a new localization displayed as "Unknown Language" than complain that it did that and also crashed creating PDFs or whatever it is that uses number formatting. THe InvariantCulture.NumberFormat seemed like a safe choice to prevent crashing at least.

Sorry, I was trying to get you to do what you'd already done. Somehow I missed that there is a default case.


src/L10NSharp/L10NCultureInfo.cs, line 147 at r1 (raw file):

Previously, StephenMcConnel (Steve McConnel) wrote…
I think you're missing the ! (not) operator. I have to scan the entire list to verify that the item does not exist before adding it. I suppose I could create each of the objects and add them only if they were not already in the list. But this approach avoids adding a (tiny) amount of wasted effort for when the system actually knows one or more of the languages we're adding if needed. (I think Windows .Net does know about Dari. I don't know if Mono is aware of any of the three languages we're adding.)

Yes, somehow I was mis-reading what you have. Sorry.


src/L10NSharp/L10NCultureInfo.cs, line 76 at r2 (raw file):

                  break;
              default:
                  NativeName = "Unknown Language";

Missed this before, sorry...but should we make them all something like "Unknown Langauge (tpi)? Or even just use the code without emphasizing that it's unknown?


src/L10NSharp/L10NCultureInfo.cs, line 182 at r2 (raw file):

          return (that.Name == this.Name) &&
              (that.EnglishName == this.EnglishName);
      }

I still think you should provide an implementation of == and !=. I find it very counter-intuitive that new L10NCultureInfo("fr").Equals(new L10NCultureInfo("fr") is true, while new L10NCultureInfo("fr") == (new L10NCultureInfo("fr") is false. Things I find on the web are ambivalent...MS apparently prefers not to mess with ==, but acknowledges it is appropriate in some cases, particularly immutable objects, which I believe yours is. Others agree with me. It's up to you...I won't insist, but if you don't do it I think a warning in the class comment might be appropriate.


Comments from Reviewable

StephenMcConnel commented 6 years ago

Review status: all files reviewed at latest revision, 4 unresolved discussions.


src/L10NSharp/L10NCultureInfo.cs, line 76 at r2 (raw file):

Previously, JohnThomson (John Thomson) wrote…
Missed this before, sorry...but should we make them all something like "Unknown Langauge (tpi)? Or even just use the code without emphasizing that it's unknown?

I made them all "Unknown Language (cod)".


src/L10NSharp/L10NCultureInfo.cs, line 182 at r2 (raw file):

Previously, JohnThomson (John Thomson) wrote…
I still think you should provide an implementation of == and !=. I find it very counter-intuitive that new L10NCultureInfo("fr").Equals(new L10NCultureInfo("fr") is true, while new L10NCultureInfo("fr") == (new L10NCultureInfo("fr") is false. Things I find on the web are ambivalent...MS apparently prefers not to mess with ==, but acknowledges it is appropriate in some cases, particularly immutable objects, which I believe yours is. Others agree with me. It's up to you...I won't insist, but if you don't do it I think a warning in the class comment might be appropriate.

Done.


Comments from Reviewable

JohnThomson commented 6 years ago
:lgtm:

Reviewed 1 of 1 files at r3. Review status: all files reviewed at latest revision, 2 unresolved discussions.


Comments from Reviewable