mui / material-ui

Material UI: Comprehensive React component library that implements Google's Material Design. Free forever.
https://mui.com/material-ui/
MIT License
93.38k stars, 32.14k forks

[l10n] Create a whitelist of support locales #39525

Open oliviertassinari opened 11 months ago

oliviertassinari commented 11 months ago

What's the problem? 🤔

We currently add new locales as developers propose them; however, we are not very clear about which locales we support. In https://mui.com/material-ui/guides/localization/#supported-locales, we say:

Screenshot 2023-10-19 at 17 34 14

Developers are left wondering, will my locale be accepted? Which locale is missing?

What are the requirements? ❓

I think it could be awesome to:

Context

This strategy would apply for MUI X too, cc @joserodolfofreitas

flaviendelangle commented 11 months ago

Are we sure we want to refuse any locale that is not in the TOP100?

For X, it would mean removing the Hebrew (TOP128), the Icelandic (not in the TOP200), the Czech (TOP104), the Norwegian (TOP171), the Danish (TOP165), the Catalan (TOP129) and maybe others.

This threshold feels very arbitrary.

And if we keep the existing locales and just refuse new ones, it's not coherent either.


What's the main reasoning behind the idea of limiting the locales?

oliviertassinari commented 11 months ago

@flaviendelangle A good number of these examples are above the 0.1% website use threshold of https://w3techs.com/technologies/overview/content_language, so I think we can keep these.

The most important thing for me is that we are upfront about the locales we are missing, and say that by default all others won't be accepted, with case-by-case acceptance (it doesn't have to be the top 100). Today, it feels like we default to accepting all the locales that are proposed. I'm worried that supporting a locale spoken by "only" 1M people has less impact than spending that time solving another problem. Nothing prevents developers from translating the components themselves, but for introducing features and solving bugs, we maintainers are often the bottleneck.

What's the main reasoning behind the idea of limiting the locales?

flaviendelangle commented 11 months ago

If the criterion is 0.1% of website usage, then only Icelandic falls short.

My main point is that, without any clear limitation, people are not creating tons of "useless" locales (on X at least), so the pain-point feels very theoretical to me and I'm not sure it's worth refusing a locale.

The opportunity cost of refusing it is not zero: we may even spend more time explaining to the user(s) why we can't accept their PR than just reviewing it.

And even if it is used by 1M people, if we have some potential paying users in this country, seeing that their locale is available can be a differentiating factor. If we start refusing locales, we should probably make it very clear that people can easily create their own. This is something the current core doc covers very briefly, and even the X doc is probably not beginner-friendly enough.

This is mostly from the perspective of X of course.


For your idea of listing the main missing locales, I totally agree that it could be super useful.

oliviertassinari commented 11 months ago

Maybe we could move forward like this:

  1. Create a sorted list of the top 100 locales we want to support. Material UI currently has about 56, MUI X 33. We still have a lot of room to grow into.
  2. Make this list easy to access, and have all MUI's open source projects align on it. Encourage the community to contribute to them.
  3. By default, reject locales outside of the list, or accept them only by trading one for another. Once we have been at this 100-locale limit for long enough, reconsider.

If 1. is too much work, then at a minimum, have:

alexfauquette commented 10 months ago

Create a sorted list of the top 100 locales we want to support.

Agree on that one. Since we are now able to display the completion level, it could be an incentive.

100 might be a lot. I had a look at our user data. It only contains 42 different locales:
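As a rough illustration of what a completion level could mean, here is a minimal TypeScript sketch that counts how many reference keys a locale translates. The key names, the `LocaleMessages` type, and `completionLevel` are all hypothetical and are not taken from the MUI codebase; this thread does not describe how the displayed figure is actually computed.

```typescript
// Hypothetical shape: a locale is a flat map from translation key to string.
type LocaleMessages = Record<string, string>;

// Hypothetical reference (English) messages; the keys are illustrative only.
const referenceMessages: LocaleMessages = {
  emptyLabel: 'No rows',
  nextPage: 'Next page',
  previousPage: 'Previous page',
  rowsPerPage: 'Rows per page',
};

// A partial locale: three of the four reference keys are translated.
const partialLocale: Partial<LocaleMessages> = {
  emptyLabel: 'Aucune ligne',
  nextPage: 'Page suivante',
  previousPage: 'Page précédente',
};

// Percentage of reference keys that have a non-empty translation.
function completionLevel(
  reference: LocaleMessages,
  locale: Partial<LocaleMessages>,
): number {
  const keys = Object.keys(reference);
  if (keys.length === 0) {
    return 100; // Nothing to translate, so the locale is trivially complete.
  }
  const translated = keys.filter((key) => {
    const value = locale[key];
    return typeof value === 'string' && value.trim() !== '';
  });
  return Math.round((translated.length / keys.length) * 100);
}

const partialLocaleCompletion = completionLevel(referenceMessages, partialLocale);
```

Showing such a percentage next to each locale would make it visible which locales need contributions, without requiring maintainers to speak the language.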

Locales of our users
  1. English
  2. Spanish
  3. Russian
  4. Portuguese
  5. Chinese
  6. French
  7. Japanese
  8. Korean
  9. German
  10. Vietnamese
  11. Polish
  12. Italian
  13. Turkish
  14. Ukrainian
  15. Swedish
  16. Thai
  17. Dutch
  18. Indonesian
  19. Hebrew
  20. Czech
  21. Norwegian Bokmål
  22. Hungarian
  23. Arabic
  24. Finnish
  25. Danish
  26. Romanian
  27. Slovak
  28. Greek
  29. Persian
  30. Bulgarian
  31. Croatian
  32. Catalan
  33. Lithuanian
  34. Serbian
  35. Slovenian
  36. Uzbek
  37. Estonian
  38. Hindi
  39. Norwegian
  40. Latvian
  41. Azerbaijani
  42. Belarusian

The opportunity cost. It's time we spend reviewing PRs that we could spend on more impactful things.

The review process is quite straightforward since we are not able to speak those languages, except when contributors don't run the l10n or prettier scripts, which means running a few commands ourselves, but that could be automated.

LukasTy commented 10 months ago

After hearing all the arguments, I'm not sure we need a list of 100. IMHO, the only slight problem that could be improved is the consistency between Material and X. There are some locales that exist only in X but not in Material, and vice versa. Maybe we could incentivize the introduction of locales to align the list by showing some sort of placeholder entry for locales that exist in the other package but are not in the package in question?

This is a rough representation of the idea:

Screenshot 2023-11-03 at 10 04 15

The complete idea would be:

  1. Align the design of supported locales tables between Material and X
  2. Include missing locales from Material in X table
  3. Include missing locales from X in Material table
  4. Design the missing entries in a clear way
  5. Keep accepting locale contributions from the community and syncing tables for now
  6. (Optional) Allow hiding (toggle) the not-existing locales in the table
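Steps 2 and 3 above boil down to taking the union of both locale lists and flagging where each entry is missing. A minimal sketch, assuming locales are identified by string codes; the `LocaleRow` type, `buildAlignedTable`, `missingFrom`, and the sample lists are hypothetical placeholders, not the real contents of either package:

```typescript
// One row of the aligned "supported locales" table (hypothetical shape).
interface LocaleRow {
  code: string;
  inMaterial: boolean;
  inX: boolean;
}

// Build the union of both lists, sorted by code, with a presence flag per package.
function buildAlignedTable(materialLocales: string[], xLocales: string[]): LocaleRow[] {
  const rows: Record<string, LocaleRow> = {};
  for (const code of materialLocales) {
    rows[code] = { code, inMaterial: true, inX: false };
  }
  for (const code of xLocales) {
    if (Object.prototype.hasOwnProperty.call(rows, code)) {
      rows[code].inX = true;
    } else {
      rows[code] = { code, inMaterial: false, inX: true };
    }
  }
  return Object.keys(rows)
    .sort()
    .map((code) => rows[code]);
}

// Codes that would be rendered as placeholder ("missing") entries for one package.
function missingFrom(rows: LocaleRow[], target: 'material' | 'x'): string[] {
  return rows
    .filter((row) => (target === 'material' ? !row.inMaterial : !row.inX))
    .map((row) => row.code);
}

// Placeholder sample data, NOT the actual locale lists of Material UI or MUI X.
const alignedRows = buildAlignedTable(['frFR', 'deDE', 'jaJP'], ['frFR', 'deDE', 'isIS']);
const missingFromMaterial = missingFrom(alignedRows, 'material');
const missingFromX = missingFrom(alignedRows, 'x');
```

The optional toggle in step 6 would then only need to filter out the rows where the flag for the current package is false.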

WDYT about such a middle-ground idea @oliviertassinari @flaviendelangle @alexfauquette?

oliviertassinari commented 10 months ago

@LukasTy I like the push to align the locales:

alexfauquette commented 10 months ago

About the limit of 100, do you have a list of those locales? I did not find one.

oliviertassinari commented 10 months ago

@alexfauquette No specific list in mind for these 100 locales. I trust that the closer we get to the limit, the more we will try to optimize for the likelihood of a new locale being used.

igorbga commented 10 months ago

@oliviertassinari as one of those speakers of a language known by only around 2 million people or less (Basque, eu_ES), I found your proposal of explicitly ignoring any minority language quite gross, offensive, and discriminatory. I hope and suspect that this was not your intention, and I will try to keep my comment as constructive as possible, but I thought it was worth mentioning what a speaker of one of those minority languages will probably feel when reading your comments.

I'm not a big fan of the political correctness that nowadays permeates everything, but the fact that you not only fail to encourage but propose to intentionally exclude those locales feels really, really bad.

It doesn't help either that in your second comment you reconsider the criteria for selecting the 100 languages because some of the languages left out feel too much like first-class languages to you, unlike others that have more speakers and would be left out to make room for your "favourite" ones.

I can understand that there are practical reasons, but are they really worth being so offensive to a whole group of communities, even if they are just a few million people each?

Now let me suggest some possible improvements to alleviate some of the issues you mention. I'm well aware that I know nothing about Material UI and its internals, so my proposals might be unfeasible or too complicated, but I want to make sure that you have at least considered them.

Bundle size:

Wouldn't it be possible to package some of the less frequent locales in a different package or module so that they are not bundled nor present by default but can be explicitly imported for developers who want to support those extra locales? Maybe they could even be imported one by one instead of all of them at once.
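One way to picture this suggestion is an opt-in registry: nothing is loaded for a locale until the application registers and requests it. The sketch below is hypothetical and self-contained; `registerLocale`, `getMessages`, the `xx` locale code, and the message keys are invented for illustration and are not MUI APIs. It uses synchronous loader functions to stay runnable on its own, whereas a real bundler setup would more likely use a dynamic `import()` that returns a promise.

```typescript
type Messages = Record<string, string>;
type LocaleLoader = () => Messages;

// Default (English) messages that are always available; keys are hypothetical.
const defaultMessages: Messages = {
  emptyLabel: 'No rows',
  nextPage: 'Next page',
};

const loaders: Record<string, LocaleLoader> = {};
const cache: Record<string, Messages> = {};

// Applications opt in to a long-tail locale by registering a loader for it.
function registerLocale(code: string, loader: LocaleLoader): void {
  loaders[code] = loader;
}

// The loader runs on first use only; unregistered locales fall back to the defaults.
function getMessages(code: string): Messages {
  if (Object.prototype.hasOwnProperty.call(cache, code)) {
    return cache[code];
  }
  if (!Object.prototype.hasOwnProperty.call(loaders, code)) {
    return defaultMessages;
  }
  const loaded = loaders[code]();
  cache[code] = loaded;
  return loaded;
}

// Usage: 'xx' stands for any locale shipped outside the default bundle.
let loadCount = 0;
registerLocale('xx', () => {
  loadCount += 1; // Counts how often the loader actually runs.
  return { emptyLabel: '[xx] No rows', nextPage: '[xx] Next page' };
});

const firstLookup = getMessages('xx'); // Triggers the loader.
const secondLookup = getMessages('xx'); // Served from the cache.
```

With this shape, the cost of a rarely used locale is paid only by the applications that register it.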

PR review process:

I won't deny that this will take some time, but how many new locales do you receive in a year? Besides, as you already stated, your review process for those PRs is quite straightforward, as it does not include a thorough review by other native speakers of the language.

Thanks

LukasTy commented 10 months ago

Thank you for the reply @igorbga. I'm sorry if you got offended, I'm pretty sure that was not what Olivier intended. 🙈 I think that the main reason he is pushing for a cap on locales we bundle is the bundle size and/or dev performance.

Let's start with the fact that anyone can author and maintain their desired locale on their end, and in some cases, it might be even more beneficial if a fast feedback (fix) cycle or custom translation is required.
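To illustrate what maintaining a locale on the application side could look like, here is a minimal sketch that layers a partial translation over the English defaults, so untranslated keys keep working. The `UiMessages` type, its keys, `createCustomLocale`, and the `[xx]` placeholder strings are assumptions made for this example; the actual locale objects in Material UI and MUI X have their own shapes, documented in their localization guides.

```typescript
// Hypothetical message shape; real component libraries define their own keys.
type UiMessages = {
  emptyLabel: string;
  nextPage: string;
  previousPage: string;
};

const englishMessages: UiMessages = {
  emptyLabel: 'No rows',
  nextPage: 'Next page',
  previousPage: 'Previous page',
};

// Copy the base messages, then apply only the overrides that are real strings,
// so a missing or undefined translation falls back to English.
function createCustomLocale(base: UiMessages, overrides: Partial<UiMessages>): UiMessages {
  const result: UiMessages = { ...base };
  for (const key of Object.keys(overrides) as Array<keyof UiMessages>) {
    const value = overrides[key];
    if (typeof value === 'string') {
      result[key] = value;
    }
  }
  return result;
}

// Usage: an app-owned locale that has translated two of the three keys so far.
const customLocale = createCustomLocale(englishMessages, {
  emptyLabel: '[xx] No rows',
  nextPage: '[xx] Next page',
});
```

Because the application owns this object, a wording fix ships with the next app release instead of waiting for an upstream review.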

As for the bundle size, IMHO, even removing the top-level export (export * from './locales') would help a lot. Currently, every new locale artificially inflates the bundle size of our package. Having a big bundle size is definitely not a positive sign in the ecosystem. 🙈

oliviertassinari commented 10 months ago

I think that the main reason he is pushing for a cap on locales we bundle is the bundle size and/or dev performance.

@LukasTy To some extent yes, but I'm also worried about the opportunity cost, and who is the bottleneck. Take it from the perspective of a user whose locale is not supported: Would they rather have their locale supported or have 3 bugs fixed? They can create their own locales, but they most often can't fix these bugs on their own. So I think it's better to set a hard limit and reevaluate, once we reach it, whether this is how we have the most impact.

I'm not a big fan of the political correctness that nowadays permeates everything, but the fact that you not only fail to encourage but propose to intentionally exclude those locales feels really, really bad.

@igorbga I have rephrased my comment to get closer to the root problem. I have also reiterated it above. I hope it makes more sense under this lens; it's unfortunate that reading my comments didn't feel great. The goal is to maximize the impact while operating with limited resources.

I think that it could make sense to have a community project whose purpose is to support the long tail of locales. For example, iOS seems to support over 400 locales: https://gist.github.com/jacobbubu/1836273.

flaviendelangle commented 10 months ago

Would they rather have their locale supported or have 3 bugs fixed?

The comparison is flawed: 1 locale takes a lot less time than 3 bug fixes, even a lot less time than most of the bugs (on the codebases I worked on at least).

I feel like we are focusing on a part of our work which (for X at least) represents a very small fraction of the overall time spent. The vast majority of our time is spent doing other things than reviewing and maintaining locales. And of the time we spend on locales, the vast majority is spent on ones that would be in the whitelist anyway.

So for me, the opportunity cost is negligible and limiting the locales would send a very bad message to our community.


Concerning the bundle size, we discussed it in the eXplore weekly meeting today and we will try to modify our bundling strategy to allow imports of the individual locales (imports of depth 2). See https://github.com/mui/mui-x/issues/10920 for more details.

oliviertassinari commented 10 months ago

1 locale takes a lot less time than 3 bug fixes

@flaviendelangle OK, maybe I'm being unfair. It might be closer to reality if we count the cost of having the locale for each set of components (Core, Data Grid, Charts, future components), including the cases where we finish these locales ourselves to keep them clean after refactoring, and compare that with small bugs.

In this case, I think that we can continue with more or less the previous plan: we sync locales between MUI projects, we accept all locales proposed, e.g. Catalan, and once we hit 100, we will see if there is a need to change the threshold.

We are far from the threshold, and I doubt we will reach it any time soon. Maybe at 100, we will have solved the bundle size issue, maybe nobody will propose more locales, maybe we will have 50% of the locales not fully translated so we would remove a couple, maybe we will feel we can easily support 200, etc. It's unknown.

Another benchmark, 600+ locales for the Intl API: https://stackoverflow.com/a/73877141/2801714