googlefonts / aosp-test-texts

Testing texts for font development, derived from UI text runs in Android source code
Apache License 2.0

Localized strings appear to include `"` #5

Open chrissimpkins opened 3 years ago

chrissimpkins commented 3 years ago

It looks like we are parsing extra double quotes in the JSON keys.

Examples of some of the localized strings:

"\"\" adlı istifadəçiyə  fayl göndərilir"
"\" ከበደ ሚካኤል ቤትደውል....\""
"\"\" kişisinin engellemesi kaldırılacak."

It looks like this may be defined at https://github.com/googlefonts/aosp-test-texts/blob/30aa5072c0737181041a20709c700c7f25dc7fea/src/extract_strings.py#L543-L553?

@belluzj @madig Are these double quotes that are set in UI text? If so, why do some have empty contents?

madig commented 3 years ago

I honestly don't know if that stuff actually appears in the UI... I'm not familiar with how Android app localization works. Can you please reach out to someone at Google and ask them how these quotes and other placeholders are (to be) treated?

belluzj commented 3 years ago

The quotes in the first string are intentional in the XML file; what's missing is the RECIPIENT text that should be interpolated inside the quotes. Sometimes the <xliff:g></xliff:g> elements provide example data that I use in the strings dump, but sometimes (as in this example) they don't, so I remove the <xliff:g> element and leave nothing in its place, which produces the two empty quotes.
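To make that behaviour concrete, here is a minimal sketch of the substitution, not the actual code in extract_strings.py. The function name `flatten_string`, the sample resource, and the example value "Alice" are all hypothetical; only the `<xliff:g>` element and its `example` attribute come from Android's string resource format.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified tag name of <xliff:g> as ElementTree reports it.
XLIFF_G = "{urn:oasis:names:tc:xliff:document:1.2}g"


def flatten_string(elem):
    """Return the text of a <string> element.

    Each <xliff:g> placeholder is replaced by its `example` attribute,
    or by nothing at all when no example is provided.
    """
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == XLIFF_G:
            parts.append(child.get("example", ""))
        else:
            # Keep the text of any other inline markup (e.g. <b>).
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts)


# Hypothetical resource file: one placeholder with an example, one without.
SAMPLE = r'''<resources xmlns:xliff="urn:oasis:names:tc:xliff:document:1.2">
  <string name="with_example">Sending file to \"<xliff:g id="recipient" example="Alice">%1$s</xliff:g>\"</string>
  <string name="without_example">Sending file to \"<xliff:g id="recipient">%1$s</xliff:g>\"</string>
</resources>'''

root = ET.fromstring(SAMPLE)
dump = {s.get("name"): flatten_string(s) for s in root.iter("string")}
print(dump)
```

The second entry ends with a pair of escaped quotes and nothing between them, which is the same shape as the empty quotes reported at the top of this issue.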

From the JSON dump, you can work out which XML file holds the original string: open the Git repository of the matching app (here: Bluetooth) and look for res/values-{language_code}/strings.xml:

image

https://android.googlesource.com/platform/packages/apps/Bluetooth/+/refs/heads/master/res/values-az/strings.xml#83

Same story for your last example: https://android.googlesource.com/platform/packages/apps/IM/+/refs/heads/master/res/values-tr/strings.xml#56

For the middle example, the quotes also look intentional in the XML: the line, which already has quotes around it, is deliberately enclosed in another set of quotes, so there are effectively two levels of quoting, and I only strip one. As Nikolaus says, I'm not sure whether some other part of the Android framework unquotes a second time before the strings reach the UI.

image

https://android.googlesource.com/platform/packages/apps/VoiceDialer/+/refs/heads/master/res/values-am/strings.xml#22

belluzj commented 3 years ago

About the middle example, it looks like the quotes are intended to be displayed, because the content is a list of example sentences that the user can say:

image

from https://android.googlesource.com/platform/packages/apps/VoiceDialer/+/refs/heads/master/res/values/strings.xml#34

belluzj commented 3 years ago

About examples 1 and 3, here is an example of a <xliff:g> that provides an example for interpolation:

image

From https://android.googlesource.com/platform/packages/apps/Contacts/+/refs/heads/master/res/values/strings.xml#402

In such a case, the export script will use the provided example value (e.g. Gmail):

image

chrissimpkins commented 3 years ago

Ty!

Hmm. I wonder if we should simply exclude all of those example fields. Reaching out to Pixel eng to see if they have any input on how those fields in the manifest files are used. Will let you know what I hear.

chrissimpkins commented 3 years ago

Pixel eng linked me to these docs on their formatting strings: https://developer.android.com/guide/topics/resources/string-resource#formatting-strings

We can either (1) swap in a properly localized target string (hard), or (2) ignore the template text altogether and just take the localized string literal parts (easy). (2) is probably sufficient for our purposes here.
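A rough sketch of what option (2) could look like once the string text has been extracted. It assumes the placeholders are format specifiers of the `%s` / `%1$s` / `%2$d` kind described in the linked docs; the pattern is deliberately simple (no flags or widths, no special handling of `%%`), and the name `literal_parts` is made up for illustration.

```python
import re

# Simplified pattern for format specifiers such as %s, %d, %1$s, %2$.1f.
# Flags, widths, and the literal "%%" escape are not handled here.
FORMAT_SPEC = re.compile(r"%(?:\d+\$)?(?:\.\d+)?[a-zA-Z]")


def literal_parts(text):
    """Drop format placeholders and collapse the whitespace left behind."""
    without_placeholders = FORMAT_SPEC.sub("", text)
    return " ".join(without_placeholders.split())


print(literal_parts("Sending %1$d files to %2$s"))
```

Collapsing whitespace avoids double spaces where a placeholder sat in the middle of a sentence, at the cost of losing any deliberate spacing in the original string.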

I think that we can also ignore all of the extra double quotes in the strings. I'm not sure that they are informative for a type designer who sets WIP designs in the localized strings. Here are docs that seem to explain why all of these double quotes are there: https://developer.android.com/guide/topics/resources/string-resource#escaping_quotes

Note: From XML parser's perspective, there is no difference between "Test this" and &quot;Test this&quot; whatsoever. Both forms will not show any quotes but trigger Android whitespace-preserving quoting (that will have no practical effect in this case).

It looks like this is to preserve whitespace in the string literals? Must be some concatenation happening so layout is preserved across fields?
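For reference, a small approximation of the quoting rules as I read them from those docs: an enclosing pair of unescaped double quotes is not displayed, while `\"` and `\'` produce literal characters. This is only a sketch under that reading, not how the Android framework actually implements it, and `android_unquote` is a hypothetical helper name.

```python
def android_unquote(raw):
    """Approximate the displayed text for a raw string resource value."""
    text = raw.strip()
    # An enclosing pair of unescaped double quotes is not displayed.
    # After stripping the opening quote, the closing one must not be
    # preceded by a backslash (that would be an escaped, literal quote).
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"' and text[-2] != "\\":
        text = text[1:-1]
    # Escaped quotes and apostrophes are displayed as literal characters.
    return text.replace('\\"', '"').replace("\\'", "'")


for raw in ['"Test this"', r'"\"Call home\""', r"\"\" will be unblocked."]:
    print(repr(raw), "->", repr(android_unquote(raw)))
```

Under this reading, the doubly quoted voice dialer strings keep one visible pair of quotes, and the empty `\"\"` pairs survive as literal quotes, so dropping them from the test texts would be a separate, deliberate filtering step.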