Language translation

isedwards commented 7 years ago

Microsoft provides a free tool for Visual Studio developers called the Multilingual App Toolkit (MAT) to make language translation very simple.

When we originally tried using the tool we had a problem with the way it automatically changed lots of resource files each time the project was loaded. With further testing, this issue has now been resolved.

I'd like to add the MAT tools back into the project and quickly complete the language translations (there will still be some work to make sure labels etc. are large enough to fit the translated text).

mhabimana commented 7 years ago

Thanks Ian, that would be great. I support the idea.

isedwards commented 7 years ago

Hi @smachua, are you happy for us to go ahead with this? The language_translation table currently contains only 9 translations, and it would be a lot of work to add all of the TagIDs and translations for everything in Climsoft. The Multilingual App Toolkit automatically records what needs translating, so we don't need to enter all of the tags manually. Microsoft has put a lot of work into making it work very well.

mhabimana commented 7 years ago

@isedwards - Any progress on this? @smachua has not yet responded. @smachua - Can you let us know your view on this?

isedwards commented 5 years ago

Regardless of how we complete language translation, a major part of getting this finished is making sure that the text labels are large enough to fit the French and Portuguese translations. @mhabimana has offered to work on this. I will create a pull request on the dev development branch that adds automatic translations of everything in the interface using Google Translate so that Marcellin will be able to see how large the labels need to be.

isedwards commented 5 years ago

The french_translation2 branch has all of the .Text properties in the software automatically translated into French using the Microsoft Translator Text service.

In the image below, @mhabimana has highlighted some of the problems with labels now not being wide enough for the French translation.

We need to work through all of the forms to make sure that the labels are long enough for other languages including French and Portuguese.

@opencdms/climsoft-tag - Is this something that should be done during the March workshop (so that we can sort out any conflicts that arise from so many forms all being changed at the same time and make sure that everyone still has a working copy of the software at the end of the week)?

[Image: french_labels]

mhabimana commented 5 years ago

@isedwards This sounds great. I support the idea.

smachua commented 5 years ago

@isedwards and @mhabimana this is impressive. We shall try to resolve any issues during the workshop.

isedwards commented 5 years ago

Thank you Samuel - I've updated the french_translation2 branch so that it now also has Portuguese (using the Microsoft Translation service).

For testing, the language choice now appears before the login screen, but we can change that later.

@Steve-Palmer may be able to find someone to help to check and improve the Portuguese translations and @mhabimana is able to help improve the French version.

[Image: portuguese]

mhabimana commented 5 years ago

@isedwards: Well done!

isedwards commented 5 years ago

After extensive testing we have decided to move away from using the Multilingual App Toolkit in both Climsoft and R-Instat due to the number of severe problems that we are having to write code to work around.

Instead we will:

[ ] Create a procedure that will be used during development to loop over all of the controls in the forms and...
- Extract the text to be translated (usually in the Text property)
- Hash the extracted text to provide a unique id and store this id in the control's tag property
- Add the tag and the original text to a new translations.resx resource file
[ ] Implement a method for getting initial automatic translations of the contents of translations.resx using Google Translate
[ ] Ensure that people helping with translations have an easy way to contribute changes
[ ] Test whether changes that are submitted at the same time are likely to cause merge conflicts
[ ] Create a procedure that updates all displayed text in response to the language being changed at run time
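
The first and last items in the list above could look roughly like the sketch below. It is written in Python for brevity rather than in the project's .NET code, and everything in it is an assumption made for illustration: the `Control` stand-in class, the choice of SHA-1, the 6-character tag length, and the function names. A plain dictionary stands in for translations.resx.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Control:
    """Hypothetical stand-in for a form control with Text and Tag properties."""
    text: str
    tag: str = ""


def make_tag(text: str, length: int = 6) -> str:
    """Return a short, stable id for the text (assumed: first 6 hex digits of SHA-1)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:length]


def extract_translations(controls: List[Control]) -> Dict[str, str]:
    """Development-time step: tag every control and collect tag -> original text."""
    entries: Dict[str, str] = {}
    for control in controls:
        if not control.text.strip():
            continue  # nothing to translate, leave the tag untouched
        control.tag = make_tag(control.text)
        entries[control.tag] = control.text
    return entries


def apply_language(controls: List[Control], translations: Dict[str, str]) -> None:
    """Run-time step: replace displayed text using tag -> translated text.

    Controls whose tag has no translation keep their current text.
    """
    for control in controls:
        control.text = translations.get(control.tag, control.text)
```

Because the lookup goes through the tag rather than the displayed text, the language can be switched repeatedly at run time. A tag this short can collide for different strings, so a real implementation would need to detect collisions or use a longer hash.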

Translation by context

Where the same text exists twice, but the translation into another language is different because of the context, we can manually adjust the tag to have an additional suffix, e.g. b2f1ac-2. When we automatically generate hashes, we will have to check whether an existing hash needs to retain its suffix (and not be overwritten).

| Tag | Text | French | Portuguese |
|---|---|---|---|
| b2f1ac | Data Entry | La saisie des données | ... |
| b2f1ac-2 | Data Entry | Saisie de données | ... |

(The French phrases above are just provided as an example of alternative translations in French. They may not illustrate the idea of translation by context very well - unless the context in the second example is that there is not enough space to display the preferred translation)
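
One possible way to regenerate hashes without losing hand-made suffixed tags is sketched below in Python. The function names and the row layout (a dictionary of column name to value per tag) are assumptions for illustration. The rule used here, keep any existing row whose base tag is still generated and drop the rest, is only one option.

```python
from typing import Dict

Row = Dict[str, str]  # column name -> value, e.g. {"Text": ..., "French": ...}


def base_tag(tag: str) -> str:
    """Strip a manual context suffix: 'b2f1ac-2' -> 'b2f1ac'."""
    return tag.split("-", 1)[0]


def merge_generated(existing: Dict[str, Row], generated: Dict[str, Row]) -> Dict[str, Row]:
    """Merge freshly generated rows into the existing translation rows.

    Existing rows (including suffixed variants) win over generated ones as long
    as their base tag is still produced by the generator. Rows whose base tag
    no longer appears in the forms are treated as stale and left out.
    """
    merged = dict(generated)
    for tag, row in existing.items():
        if base_tag(tag) in generated:
            merged[tag] = row  # never overwrite hand-edited or suffixed rows
    return merged
```

With the two rows in the table above as `existing`, regenerating the tag b2f1ac leaves both b2f1ac and b2f1ac-2 and their French text untouched.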

Automatic removal and adding of non-alphanumeric characters

Where the same string exists twice, but with a small variation caused by non-translatable (non-alphabetic) characters at the end of the string, we will remove these, perform the translation and then add them again. This will ensure that the following only have to be translated once:

Enter value and Enter value:
Number 0 and Number 1
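
A minimal Python sketch of this strip, translate and re-add step follows. The regular expression treats any trailing run of characters that are not letters (digits, punctuation, whitespace) as non-translatable. The function names and the `lookup` dictionary are hypothetical.

```python
import re
from typing import Dict, Tuple

# Trailing run of non-letter characters: \W is any non-word character,
# and \d and _ are added because they count as word characters.
_TRAILING_NON_LETTERS = re.compile(r"[\W\d_]+\Z")


def split_trailing(text: str) -> Tuple[str, str]:
    """Split text into (translatable core, non-translatable trailing part)."""
    match = _TRAILING_NON_LETTERS.search(text)
    if match is None:
        return text, ""
    return text[:match.start()], match.group()


def translate_with_suffix(text: str, lookup: Dict[str, str]) -> str:
    """Translate only the core and put the trailing characters back unchanged."""
    core, suffix = split_trailing(text)
    return lookup.get(core, core) + suffix
```

With this split, "Enter value" and "Enter value:" share the core "Enter value", and "Number 0" and "Number 1" share the core "Number", so each pair needs only one entry in the translation file.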

Currently we don't have a universal solution to achieve a similar result for text in the middle of a string, e.g. "There are 12 results", "There are 15 results", "There is 1 result". This would require something along the lines of ("There % % result%", "are", "12", "s").

Strings provided dynamically from code/database

We can translate strings that are updated at run-time by first running these through the hashing algorithm and then looking up the translation as usual.
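
As a rough Python illustration, a run-time lookup might hash the incoming string and fall back to the original text when no translation exists. The hash choice (SHA-1, 6 hex digits), the function names and the nested-dictionary layout are assumptions, not an agreed design.

```python
import hashlib
from typing import Dict


def make_tag(text: str, length: int = 6) -> str:
    """Assumed hashing scheme: first 6 hex digits of the SHA-1 of the text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:length]


def translate(text: str, language: str, tables: Dict[str, Dict[str, str]]) -> str:
    """Look up text by its hash in tables[language]; return text unchanged if missing."""
    return tables.get(language, {}).get(make_tag(text), text)
```

Falling back to the original string means that untranslated run-time messages still appear, just in the source language.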

We need to be able to mark all strings that are hard-coded, and those that are retrieved from the database, so that they can automatically be included for translation. In the code, this could be achieved by wrapping the strings in a dummy function t() that returns the same value that it is given. The strings could then be found through introspection of the code, looking for calls to the t() function, e.g. t('Text to translate').

To translate the same text differently depending on context we could add the required suffix as an optional argument e.g. t('Text to translate', '-2').
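
The dummy function and the search for its calls could be sketched in Python as below. The regular expression is a simplification made up for this example: it handles single- or double-quoted literals and an optional quoted suffix, but not escaped quotes or strings built by concatenation.

```python
import re
from typing import List, Tuple


def t(text: str, suffix: str = "") -> str:
    """Dummy marker: returns the text unchanged; the suffix is only for extraction."""
    return text


# Matches t('...') or t("...") with an optional second quoted argument.
_T_CALL = re.compile(
    r"""\bt\(\s*(['"])(?P<text>.*?)\1\s*(?:,\s*(['"])(?P<suffix>.*?)\3\s*)?\)"""
)


def find_marked_strings(source: str) -> List[Tuple[str, str]]:
    """Return (text, suffix) for every t(...) call found in the source code text."""
    return [
        (match.group("text"), match.group("suffix") or "")
        for match in _T_CALL.finditer(source)
    ]
```

Each (text, suffix) pair can then be hashed and written to the translation file, with the suffix appended to the generated tag.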

Where text is stored in the database, we will need a function that is aware of the tables and columns that text is taken from so that we can extract all values in these locations. In this case, we don't currently have a method to translate differently depending on context.
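
A function of this kind might look like the following Python sketch. It uses sqlite3 only so that the example is self-contained, and the table and column names passed in are hypothetical. The list of (table, column) pairs is assumed to be maintained by developers, so the identifiers are trusted.

```python
import sqlite3
from typing import Iterable, List, Tuple


def extract_db_strings(
    conn: sqlite3.Connection, columns: Iterable[Tuple[str, str]]
) -> List[str]:
    """Collect the distinct, non-empty text values from the given (table, column) pairs."""
    texts = set()
    for table, column in columns:
        # Identifiers come from our own configuration list, not from user input.
        query = f'SELECT DISTINCT "{column}" FROM "{table}"'
        for (value,) in conn.execute(query):
            if value:
                texts.add(value)
    return sorted(texts)
```

The returned strings could then go through the same hashing step as text extracted from the forms.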

Metadata

In addition to the translations, we need to store metadata comparable to that of the Multilingual App Toolkit, so that automatic translations are initially marked as needing review.

The Multilingual App Toolkit has the states New, Needs Review, Translated and Final. There may also be other options (it seems to be possible to "lock" translations so that they are not accidentally edited).
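
Comparable metadata could be modelled as in the Python sketch below. The four state names come from the list above. The `Entry` class, the `locked` flag and the rule that machine translation never overwrites reviewed or locked entries are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NEW = "New"
    NEEDS_REVIEW = "Needs Review"
    TRANSLATED = "Translated"
    FINAL = "Final"


@dataclass
class Entry:
    """One translatable string with its review metadata (hypothetical layout)."""
    text: str
    translation: str = ""
    state: State = State.NEW
    locked: bool = False


def set_machine_translation(entry: Entry, translation: str) -> bool:
    """Store an automatic translation and mark it as needing review.

    Returns False without changing anything if the entry is locked or has
    already been translated or finalised by a person.
    """
    if entry.locked or entry.state in (State.TRANSLATED, State.FINAL):
        return False
    entry.translation = translation
    entry.state = State.NEEDS_REVIEW
    return True
```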

Editor

We need to choose an editor that people can use to easily contribute suggestions (preferably not an editor that requires us to convert back and forth to xml). In addition, we may want to crowd source translations using online services.

dannyparsons commented 5 years ago

Just a small point on text in the middle of a string, e.g. "There are 12 results", "There are 15 results", "There is 1 result". I think it's generally recommended to avoid this in software that will be translated. I have certainly seen advice against putting textbox controls in the middle of a sentence, because the order of words is not the same in different languages. We could separate the words from the numbers, e.g. "Number of results: 12". So hopefully this is something we don't have to worry too much about.

climsoft / Climsoft

Language translation #200
