unfoldingWord-dev / ts-android

A tool to translate Bible stories into your own language
http://ufw.io/ts

Add Frame Length Check #419

Open jag3773 opened 9 years ago

jag3773 commented 9 years ago

Based on the statistics that @richmahn put together at http://themahn.com/language_comp.php?s=en&t=ALL we ought to be able to give a warning to translators when their translation is either 50% shorter or 50% longer than their source text.

@neutrinog This should probably go into the pre-upload check when the completed toggle is on.
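A minimal sketch of the 50% check proposed above, for illustration only. The class and method names are hypothetical and not taken from the ts-android code base, and measuring length in Unicode code points is an assumption; the thread does not say how length should be measured.

```java
/** Illustrative sketch of the proposed 50% length warning (hypothetical names). */
final class FrameLengthCheck {
    // Allowed relative deviation from the source length: 50%, per the proposal.
    static final double TOLERANCE = 0.5;

    /** Length in Unicode code points (an assumption), ignoring surrounding whitespace. */
    static int length(String text) {
        String trimmed = text.trim();
        return trimmed.codePointCount(0, trimmed.length());
    }

    /** True when the translation is more than 50% shorter or longer than the source. */
    static boolean needsWarning(String source, String translation) {
        int sourceLength = length(source);
        if (sourceLength == 0) {
            return false; // no source text to compare against
        }
        double ratio = (double) length(translation) / sourceLength;
        return ratio < 1.0 - TOLERANCE || ratio > 1.0 + TOLERANCE;
    }
}
```

A pre-upload check could call `FrameLengthCheck.needsWarning(sourceFrame, translatedFrame)` for each completed frame and show a notice rather than block the upload.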

richmahn commented 9 years ago

Should it be 50% shorter or longer, or should we warn when a frame varies by more than 20% from the median of all frames? For example, comparing French to English (http://themahn.com/language_comp.php?s=en&t=fr), most frames are between 105% and 115% of the length of their English counterpart. But you will see that the shortest of all frames (Frame Low) is at 89.47%. This frame is 12-08, and the English is:

The Israelites marched through the sea on dry ground with a wall of water on either side of them.

The French is:

Les Israélites ont marché sur la terre sèche, entre les murs d'eaux.

which translates to:

The Israelites walked on dry land, between the walls of water.

Seems like it is lacking a bit of detail.

richmahn commented 9 years ago

Using my suggestion to determine whether a frame is unusually far below or above the normal (median) difference between source and target, by 20% either way, I have altered http://themahn.com/language_comp.php to alert us to such frames. A heading will be red if a frame in its group has a problem, and the toggle button that leads to that frame will also have a pinkish background. Drilling down will eventually find all such frames.
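The median-based idea can be sketched roughly as follows. This is not the code behind language_comp.php; the names are hypothetical. It assumes each frame's ratio is target length divided by source length, and that "20%" means 20 percentage points away from the median ratio. Reading it as 20% relative to the median would be an equally plausible interpretation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch of a median-based frame length check (hypothetical names). */
final class MedianLengthCheck {
    // Assumed threshold: 20 percentage points away from the median ratio.
    static final double MAX_DEVIATION = 0.20;

    /** Median of a non-empty array; the input array is left unmodified. */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Indices of frames whose target/source ratio is too far from the median. */
    static List<Integer> outliers(double[] ratios) {
        List<Integer> flagged = new ArrayList<>();
        if (ratios.length == 0) {
            return flagged; // nothing to check
        }
        double median = median(ratios);
        for (int i = 0; i < ratios.length; i++) {
            if (Math.abs(ratios[i] - median) > MAX_DEVIATION) {
                flagged.add(i);
            }
        }
        return flagged;
    }
}
```

Note that in this sketch the suspicious frames themselves contribute to the median, so a translation that is consistently too short would shift the median and hide the problem.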

richmahn commented 9 years ago

Again, I'm not sure that is the most scientific method, since the frames that have problems were included when finding the median variation, but it still seems to work. Again, this would just serve as a warning/notice to the translator.

da1nerd commented 9 years ago

@HannaJaicks all of this is located in door43.translationstudio.uploadwizard. The actual validation occurs in ReviewFragment.java. You'll notice this is using an AsyncTask to complete the validation. We'll eventually want to migrate all of the validation into a new subclass of ManagedTask for the TaskManager. However, feel free to just add in the new validation first if you want. We can revisit this after you are more familiar with the code.

da1nerd commented 9 years ago

Note: the upload manager has seen a lot of updates recently. This validation should be placed in com.door43.translationstudio.tasks.ValidateTranslationTask

PhotoNomad0 commented 8 years ago

@richmahn - I cannot open the web page anymore. I'm wondering if this is valid for Asian languages as well? I think it may have to be based on words rather than characters, since some languages are more terse than others. I'm thinking of Chinese, where one character may be multiple words, or Japanese, where three different character sets are used and may be interspersed.
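For what it's worth, here is a rough sketch of a word-based count, with hypothetical names and the simplifying assumption that words are separated by whitespace. That assumption does not hold for Chinese or Japanese, which are normally written without spaces between words, so those languages would need a proper word segmenter rather than this approach.

```java
/** Illustrative whitespace-based word counter (hypothetical names). */
final class WordCount {
    /** Counts whitespace-separated tokens; returns 0 for blank input. */
    static int countWords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        // "\\s+" treats any run of whitespace as a single separator.
        return trimmed.split("\\s+").length;
    }
}
```

With this counter, a Chinese sentence written without spaces is counted as a single word, so a word-based ratio against an English source would look far too short.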