atk4 / ui

Robust and easy to use PHP Framework for Web Apps
https://atk4-ui.readthedocs.io
MIT License
443 stars 106 forks source link

i18n support #1833

Open mkrecek234 opened 2 years ago

mkrecek234 commented 2 years ago

In Atk4/Ui a number of strings like "Save", "Add", or "Delete" are hard-coded and thus localisation is not easily possible.

I suggest we make it centrally configurable in uiPersistence, but not sure what the best approach is:

In uiPersistence, We define a function _tUi(...) which by default returns null, but can be configured to be a callback function with the same parameters as _t(...) function in Atk4/i18n.

In each view/control where a string is needed (example form save button), we then check if _tUi('ui.form.button.save') returns something other than null, if not we use the hard-coded string still. This makes it flexible to use any translation repo while keeping full backward compatibility if no localisation is needed.

@mvorisek if you could help to define the function in uiPersistence and make one coding example what to change in a control where a hard-coded string is needed, I could do the nasty work to implement this for all Ui hard-coded strings.

I suggest we make a clear logic how to name the strings like in the example above.

mvorisek commented 2 years ago

Please provide several repro/demo examples where the strings are hardcoded.

mkrecek234 commented 2 years ago

Form's save button: https://github.com/atk4/ui/blob/01d39d24694e4b4f63a576d090e55e0e97aaa9e4/src/Form.php#L76 EceutorFactory: https://github.com/atk4/ui/blob/01d39d24694e4b4f63a576d090e55e0e97aaa9e4/src/UserAction/ExecutorFactory.php#L56

mvorisek commented 2 years ago

Thank you and 👍 for posting links /w commit hash (instead of a branch that can change and line numbers can be out of date).

In the Form example, the value is set as property default, so no call to a method is possible here. What do you propose?

mkrecek234 commented 2 years ago

We should then remove the default in the property definition and load it in the init function by checking if _t function is null or otherwise set to 'Save' in this example, so somehting like

$this->buttonSave = [Button::class, $this->getApp()->uiPersistence->_tUi('form.button.save') ?? 'Save', 'class.primary' => true];

For the translation reference we might to define an automatic name that is generated by the classes name eventually? So 'form.button.save' is rendered automatically? Just brainstorming...

mvorisek commented 2 years ago

That would break DI if done in init, so need to be done in constructor, but that would break the possibility to override the variable default in child class.

So it would need to be a fallback... Quite hard to read.

Maybe be should define some grammar like __atk4_translate(atk4.ui.form.button.save)__ or __atk4_text(Save)__ which will be replaced by regex during rendering.

mkrecek234 commented 2 years ago

By fallback you mean, that it is only replaced if not yet set? Grammar and replace in rendering sounds also very elegant. Your take, what coding structure would work best. Today we already have some configuration through uiPersistence->dateFormat etc., so it makes sense to me to keep the general structure eventually the same?

mvorisek commented 2 years ago

By fallback you mean, that it is only replaced if not yet set?

yes

Your take, what coding structure would work best.

quite complex thema, to support plural (in Czech for example, we have even different endings for {2, 3, 4} vs. {5, 6, ...})

normally translations are done with patters like 'Add %n pieces to basket' so the translation can reorder the words or even the variables

Today we already have some configuration through uiPersistence->dateFormat etc., so it makes sense to me to keep the general structure eventually the same?

you mean moving formatting into localisation?

https://www.lionbridge.com/blog/translation-localization/localization-globalization-internationalization-whats-the-difference/

we should keep this in head, but for now, I would focus mainly on the simplest - translation

mkrecek234 commented 2 years ago

Agree - any way to open up translation possibilities is better than what we have today. I think singular/plural things are very well considered if you look in atk4/i18n or the underlying Symfony repo - we can still think about to transmit further parameters to the translate function later on which allows for correct grammar/singular/plural stuff.

No don't mean to move formatting to localisation, only that uiPersistence's properties is a good place to also configure any translate function.

mvorisek commented 2 years ago

I have not used atk4/i18n in any project, does this library support pattern translations (like Add %n pieces to basket) or does it require translations id (form.button.save)?

If we have a lib /w pattern translations support, we can fuzzy tokenize the current render based on html/common text delimiters with a regex and "try translate" these matches. Of course, if some company would be named like "Name: next" it would be translated wrongly to: "Name: weiter". Would this approach "almost solve this for you" or "better to stay with English then"?

If larger changes are needed, this must target 4.1 release or later, as I would like this stable release v4.0 to be done soon and there is still a lot of work to do.

mkrecek234 commented 2 years ago

Atk4/i18n is a wrapper for Symfony\Translation made by @ibelar - this supports pattern translations, and also simple logic like male/female or other conditional texts. I'd think this is using Symfony/Translation is very good practice.

The function is defined function _t(string $id, array $param = [], string $domain = null, string $locale = null): stringusing an id (e.g. atk4.ui.form.button.save), allowing to pass any number of parameters which then can be used in the translation files to return the proper string; $domain can structure translations in different files (e.g. one domain for atk4/ui, one domain for atk4/filestore etc.) - $locale should ne null. I would recommend to then use an id logic 'form.button.save' while having a standard $domain = 'atk4.ui'

Here is a sample translation entry giving a glimpse how complex translations are possible:

    // use as _t('finish_result', ['result' => 2]);
    'finish_result' => 'Vous avez terminé {result, ordinal,
        one
        two
        few
        other
        }!',
    // the 'other' key is required, and is selected if no other case matches
    'invitation_title' => '{organizer_gender, select,
        female {{organizer_name} aimerais vous invitez à son party!}
        male   {{organizer_name} aimerais vous invitez à son party!}
        other  {Nous, {organizer_name}, aimerions vous invitez à notre party!}
    }',
    'published_at' => 'Publié le {publication_date, date} - {publication_date, time, short}',
    'progress' => 'Le travail accompli est de {progress, number, percent}!',
    'value_of_object' => 'Cet object vaut la somme de {value, number, currency}',
    'num_of_apples' => '{apples, plural,
        =0    {Le panier ne contient aucune pomme.}
        one   {Le panier contient une pomme.}
        other {Le panier contient # pommes.}
    }',
mkrecek234 commented 2 years ago

If we can achieve where today hardcoded strings are present, instead - if defined - a translate function is called passing an id like form.button.save as parameter and checking if something is returned, that would be perfect.

I think parametrisation (singular, plural) is not needed at this stage, but then could very easily be extended: You would just need to pass e.g. the number of deleted records to the translate function. Done.

I would not fancy thus any default translation of words in rendering, because atk4/i18n / Symfony/Translation is so much better already to do what we need.

mvorisek commented 2 years ago

by "pattern translations" I meant what I provided with an example - "Add %n pieces to basket" - can this be passed to atk4/i18n or is translation ID required?

function _t(string $id, array $param = [], string $domain = null, string $locale = null): string prototype requires we call this function on all places where rendered - can we do that? (read more for usages in atk4/data and core)

https://github.com/atk4/ui/blob/01d39d24694e4b4f63a576d090e55e0e97aaa9e4/src/Form.php#L378 many methods to create label/header/description require a string, what would be accepted now? A translation ID and fallback to the input string? If so, how would we detect missing translation ID?

https://github.com/atk4/data/blob/d8cd47e77399263b7056e6a2cf3b235e4caf47b9/src/Model/UserActionsTrait.php#L129 some texts come from atk4/data

or even from https://github.com/atk4/core/blob/develop/src/ReadableCaptionTrait.php#L14 from fuzzy class name to caption string conversion, how can this be translated?

where the dependency of atk4/i18n should be added? into atk4/core directly? if so, should the atk4/i18n repo be merged in?

mkrecek234 commented 2 years ago

You always pass an id and parameters - the _t function will then lookup the id and get the pattern you mention from the language file an insert the parameters at the placeholder. See examples above.

If captions are used, there is no need to translate, as the developer can define captions as he likes including using _t function.

All methods requiring a string will be unchanged - the developer will pass a localised string using his own translate logic.

I would definitely not fold in atk4/i18n - Why can't we just call a closure function and pass the id (and later eventually parameters where needed) - if null returned, use the harcoded string as before.

It is then up to the developer to define the closure: tie it to atk4/i18n or use any other localisation repo he likes.

If strings are in core I would leave them in there for now - in theory only strings should be defined where a return of data is done towards the user. Not sure why strings come from atk4/data, ideally localised strings only exist in atk4/ui.

mkrecek234 commented 2 years ago

So in brief: only where today is a hardcoded string to be localised, we try to fetch a localised string calling (if defined) a closure in Persistence/Ui - or take the old hardcoded string if no localised is offered.

mkrecek234 commented 2 years ago

Whenever core or data are using class name to describe, this is perfectly fine and remains like that. Today, you can always override eg a model name as a descriptor by setting model::caption.

So this should not be an issue for us in Ui.

I think we should keep it really simple - while just offering to attach a translation function optionally and not mandatory. There are not that many hard-coded strings anyhow...

mkrecek234 commented 2 years ago

The alternative would be to move all hardcoded strings to be configurable in the respective class (Form, Crud) or in Persistence/Ui. But this seems to me not so nice, especially if we have more than 5-10 strings. Surely though, this is the simplest way.

mvorisek commented 2 years ago

I see what you want - a translation method in ui persistence, which would be invoked from every place where an ui string is built [1].

This is reasonable.

In the ui persistence, the translation method can return the (untranslated) input by default. It should not return null, otherwise we will have to maintain en/default translations on two places.

When [1]/built = hardcoded in code statement (not a variable or property), this is quite simple & clear logic as long as we are in atk4/ui and projects using this dep.


But as long the [1]/hardcoded string can be a variable (a property, array seed or even an exception), it is complex.

The hard part - as I pointed with some examples above - is how to define the default/en string (or translation ID) and when to translate it - if we translate it at the source [2], how would we prevent to translate them again in upper layers?


What is [2]/source?

exceptions

In atk4/core there are hardcoded strings in exceptions. In core/data level, no translation is needed. Thus it is quite fine to translate them once they should be displayed in atk4/ui.

Problem: if an exception is not nested but rendered somewhere like $e->getMessage() [3], the exception message would be untranslated.

Warning: given this approach, even atk4/ui messages must be never translated. Otherwise they might be translated twice.

validation/data messages

In atk4/data we generate validation exceptions, but IIRC, some of them are converted to string there/[3].

So we need to store them in objects.

autocaptions

Automatic captions are important part of low code framework and data model separation from ui.

Atk4/data, mainly Model::table name and Field::shortName (but also short name computed from reference id like contact_id => Contact) should be translated at a central place.

Translating them at atk4/data Model or Field would require App/Translation service available at data level. This would be bad and not even possible in many projects.

So in atk4/ui we need to translate atk4/data definition names (but not model data itself).

Warning: same as for exceptions, even when atk4/data names are defined in atk4/ui or custom projects, no definitions of atk4/data object properties must be translated.


This post took me quite of time, but now it should be clear where to define untranslated/english text, where to translate (when defining texts to ui components) and where the translation should happen.


Integration steps needed:

mkrecek234 commented 2 years ago

For validation messages that are rendered by Exception, that should be easy as they are defined in the developers user-defined models or views. It is easy to make available the _t function there and pass over the translated exception text to data - I did this already.

Same is true for all field names and model captions, as those are also just define using the caption property which already defines it localised. Also this I did already, works great.

So to me it comes down to all Ui strings with the challenge for the default ones defined as property variables...

mvorisek commented 2 years ago

With the analysis above, Atk4\Data\Field caption should not be translated. Let's not discuss this case as the best solution might pop out during implementation.

What I stress is this:

Imagine company xxx having 2 repos. xxx/models and xxx/ui.

In xxx/models, the company defines all models (DB relations, views, migrations, ...). This repo requires (only) atk4/data, thus no translation is possible here. Even if it would require atk4/ui, the models does not (and should not) have access to initialized ui persistence nor i18n service.

Then in xxx/ui, the company defines UI to interact with these models, and requires atk4/ui and xxx/models. With i18n support, the company should be able to define translations for the models from xxx/models repo. Changing captions in each model here is almost impossible (even if you create a model and set translated caption for each field, inner joined or referenced models would be untranslated) and you would always have to create/pass the models thru some custom factory.

In some large projects I use this split. Here I specify it especially to stress the design goal, to keep the atk4/model separated from ui. With the analysis in my last post, you should be able to define 1 translation like Quantity -> Menge and it would apply for every model (Invoice, Inventory, ...) that has field quantity.

mkrecek234 commented 2 years ago

If you separate into two repos, I think conceptually you in your projects then should set atk4/data model/field captions from your Ui repo when you instantiate them (thus also there you can then set it localised using the user's locale).

By definition, data should not be dealing with user interactions, thus also not translations. However, the caption properties in atk4/data are in a way a "link towards users/UX". So I believe the current structure is great, also for localisation.

I would never do "translations", like "replace quantity by Menge", this leads to many grammatically wrong translations/messages in many languages. For example the English word "volume" needs to be translated in different ways in German, depending on the meaning: Ausgabe, Volumen. Ausgabe in German could mean in English "volume" or "expense" or "edition" with completely different meanings.

The concept of using an identifier like model.order.quantity.caption or form.saveButton.caption or CustomerView.title and then linking to a localisation repo like i18n to me is best-practice.

So default caption of a save button should be: ($this->getApp()->uiPersistence->_tUi('form.saveButton') ?? 'Save') to me. Eventually we can create an automated identifier based on the class names, so we do not have to hard-code the translation identifier everywhere? So for controls in forms it tries to lookup a translation identifier with the auto-generated model-field-name identifier model.customer.quantity, or if not found them uses the model's field name. By default the _tUi function always returns null if not configured (that means no localisation implemented). That would be powerful and highly flexible! What do you think? @ mvorisek?

mvorisek commented 2 years ago

If you separate into two repos, I think conceptually you in your projects then should set atk4/data model/field captions from your Ui repo when you instantiate them (thus also there you can then set it localised using the user's locale).

this is impossible from design perspective, as for ex. reference seeds are defined like: model => [OrderModel::class]

I would never do "translations", like "replace quantity by Menge", this leads to many grammatically wrong translations/messages in many languages. For example the English word "volume" needs to be translated in different ways in German, depending on the meaning: Ausgabe, Volumen. Ausgabe in German could mean in English "volume" or "expense" or "edition" with completely different meanings.

this is valid, maybe we should provide optional context

above, I have written:

With the analysis above, Atk4\Data\Field caption should not be translated. Let's not discuss this case as the best solution might pop out during implementation.

maybe we should translate caption (commit from data) as well, then, you will be able to define it like volume - expenses and translate even the English text if needed

'form.saveButton') ?? 'Save'

it is basically the same question.

I hate this syntax as it needs every text to be "defined twice".

So I prefer the translation method to always return a string, for CI, we can require all text to be defined, for standard use, we can passthru the input string if no translation is available.

mkrecek234 commented 2 years ago

So I prefer the translation method to always return a string, for CI, we can require all text to be defined, for standard use, we can passthru the input string if no translation is available.

No, since the translation method requires an identifier, whereas the return string has to be a localised text. And if you did not configure translation, or the text is missing, then we would want the same nice captions that Atk4/Ui today renders, e.g., based on the field or model name: so field article_id becomes Article ID. If a localisation could be found for auto-generated _t('model.article.article_id', $domain='atk4.data', $localeOfUser)='Artikelnummer' then it should take this caption.

As said, you cannot "translate" in the sense to pass the field name to the _t function and hope it gets the context. That's not feasible due to missing context and also not needed.

Let's not call it a "translated string" that the _t function returns, but it just returns localised string based on an identifier. As said, there is no way to translate based on translate('English term', 'de-DE') to German, it needs to be _t('model.article.article_id', 'en-US') and _t('model.article.article_id', 'de-DE').

mvorisek commented 2 years ago

No, since the translation method requires an identifier

we need to update it

during refactoring of atk4/core I have also noticed there is some translation logic - https://github.com/atk4/core/tree/develop/src/Translator and some preparation to translate exceptions - https://github.com/atk4/core/blob/3d80f2a57a87c1577eba9734b0b4f291a721f44b/src/Exception.php#L26

that iface must be implemented by atk4/i18n

these theps are already mentioned in https://github.com/atk4/ui/issues/1833#issuecomment-1229208313

mkrecek234 commented 2 years ago

'form.saveButton') ?? 'Save'

it is basically the same question.

I hate this syntax as it needs every text to be "defined twice".

It is not translated "twice" if you use default atk4/ui without any i18n repo. We talk here about maximum 10-20 strings which are existing today in the Ui views, like "Add", "Delete", "Edit", "Save" or so. It is not justified for that to force people to use i18n always - in unlocalised Atk4/Ui applications, this would be as fast as today:

$fieldCaption = ($this->getApp()->_tUi ? $this->getApp()->_tUi($this->autogeneratedIdentifier) : defaultName_how_it_is_rendered_today_byAtk4Ui)

$buttonSaveCaption = ($this->getApp()->_tUi ? $this->getApp()->_tUi('form.saveButton') : 'Save')

mvorisek commented 2 years ago

It is not justified for that to force people to use i18n always

good point

still, I do not like 'form.saveButton') ?? 'Save' as it introduces some semi translation, there should be no translation at all or full support

mkrecek234 commented 2 years ago

still, I do not like 'form.saveButton') ?? 'Save' as it introduces some semi translation, there should be no translation at all or full support

This is why we should only test first if $this->getApp()->uiPersistence->_tUi is defined or not. If it is defined (=i18n support enabled), then call that closure, if not go with the default value. May be there are more elegant ways to code this logic - but I liked to concept to start an untranslated app using the handy Atk4/Ui automated rendering of captions based on field names, or default strings. And at a later stage you simply "plug-in" i18n support, and add strings step-by-step where you want them. I would definitely exclude exception strings rendered by data/core here: they are meant for the developer, and not for the user.

mkrecek234 commented 2 years ago

@mvorisek Another thought on this - why don't we turn it around and on the Ui side getting any model/field captions like this:

The function getI18nIdentifier should be available for model and for fields at least. For the static texts in many Ui components ('Add', 'Delete' etc.) we probably have to define also the i18n identifier statically, like ui.form.buttonSave or ui.crud.buttonAdd

mvorisek commented 2 years ago

Are you defending translations by ID primary because atk4/i18n does not support is currently?

I understand your example with "volume", but as long as you extend the source text + translate default/English text, it is not an issue. Strictly said, this option will allow you to specify anything, also an "ID", as long as you translate these source texts...

mkrecek234 commented 2 years ago

Are you defending translations by ID primary because atk4/i18n does not support is currently?

I understand your example with "volume", but as long as you extend the source text + translate default/English text, it is not an issue. Strictly said, this option will allow you to specify anything, also an "ID", as long as you translate these source texts...

atk/i18n is structured around identifiers like ‚model.article..article_name‘ - and this is what I mean by identifier. With this you can specify anything in any language. And also passing any parameter to be incorporated in the localised string.

Please check the sample file I posted above from atk4/i18n for the translation file - these are the identifiers we need to use. And these can be auto-generated based on model name and field name easily.

Or the demo file in the repo https://github.com/atk4/i18n/blob/develop/demos/languages/fr_FR/messages.fr_FR.php Here you have multi-level identifiers like record.ask.deletein the sample.

bedengler commented 1 year ago

If that helps: In my app I have added a function called "t":

function t($stringtotranslate) {
  if(defined($stringtotranslate))
    return constant($stringtotranslate);
  else
    return $stringtotranslate;
}

All strings are defined as constants in language files. E.g.:

define("General", "Allgemein");
define("User", "Benutzer");

Now any button I'm using or anywhere where I have a string, I'm using t(). E.g.: Header::addTo($gr1, [t('General')]); Very simple and very effective. Because by setting a new language file, the whole app changes language wherever I implemented t().

As we are talking about pretty standardised names (without any fancy functions around), that could be a variation with low effort but some flexibility. It's at least better than now, where some buttons are just untranslatable...