Open phet-steele opened 6 years ago
Also... If anyone is writing layout code that is relying on leading/trailing spaces in translated strings, that's a bad practice. If the translator doesn't provide identical leading/trailing spaces, your layout will be hosed. Similarly if the translator accidentally adds leading/trailing space, your layout will be hosed. And Rosetta currently provides no support for either scenario.
I see 3 places in string.js where .value
is being used to lookup the value of a string. Lines 63, 174, 181.
One possibility would be to put an assertion like
if ( parsedStrings[ key ] !== undefined ) {
var stringValue = window.phet.chipper.mapString( parsedStrings[ key ].value, stringTest );
assert && assert(stringValue.length===stringValue.trim().length,'untrimmed string: '+stringValue);
onload( stringValue );
}
I tested it on several sims and haven't seen a problem yet.
UPDATE: My point is that rather than gracefully handling trailing/leading whitespace we should identify it and flag it as an error so it can be corrected.
Identifying leading/trailing whitespace is useful for the English (fallback) strings, which we can then fix. But we currently have no control over the strings for other locales. So stripping leading/trailing whitespace is (imo) the robust solution.
We should add a trim
on the rosetta side (before strings are committed). Then to address existing trailing/leading whitespace we could either (a) strip whitespace on the sim side as proposed or (b) do one pass through all existing strings and fix any that are problematic.
Yes, we could (should?) add a trim in Rosetta, and make a pass through all existing strings. But we still need a trim at the point where the strings are read from the file. In general, anytime you read a string from an external source (user input, file,...), and that string is not supposed to have leading/trailing whitespace, one should defensively trim the string. And that's the case with the string plugin.
11/9/17 dev meeting: • Start by trimming in string plugin, @pixelzoom will handle • @jbphet will create an issue for Rosetta to trim there
The issue in rosetta for trimming strings provided by translators is https://github.com/phetsims/rosetta/issues/163.
Presumably this can be addressed by adding trim
immediately before any call to ChipperStringUtils.addDirectionalFormatting
.
I found 2 calls to ChipperStringUtils.addDirectionalFormatting
(string.js, getStringMap.js) and put a trim
immediately before each of them. All looks good in requirejs and built modes - I verified that leading/trailing whitespace is stripped.
But with ?strings=...
, I'm still seeing leading/trailing whitespace - which means there's no trim
and no ChipperStringUtils.addDirectionalFormatting
. Since Rosetta is using ?strings
, I'm wondering if it's adding the directional formatting. @jbphet?
If Rosetta is indeed adding directional formatting, then it will also need to trim
before doing so, since the directional formatting will surround any leading/trailing whitespace.
In the meantime, I'm going to trim
the values provided via query parameter, to handle the general use case.
Doing the trim in requirejs mode would be a reasonable change in string.js:
168 if ( queryParameterStringValue ) {
169 onload( queryParameterStringValue.trim() );
170 }
Handling it in built mode is way ugly. It would involve changing sim.hmtl line 68, like this:
60 var stringOverrides = JSON.parse( phet.chipper.queryParameters.strings || '{}' );
68 return stringOverrides[ key ].trim() || window.phet.chipper.mapString( window.phet.chipper.strings[ window.phet.chipper.locale ][ key ], stringTest );
Chatting about this with @jonathanolson on Slack, we're considering not trimming the values provided via ?strings
since (presumably?) it's for use by Rosetta and developers. But my concern with not trimming ?strings
is 2-fold: (1) it introduces a special case where string values are not trimmed, (2) ?strings
may be used by third-parties in the future and we might forget to trim.
A more general concern about this entire exercise... It's disconcerting how this responsibility is spread out, over multiple files (string.js, getStringMap.js, sim.html,.. others?), build tools, Rosetta,…Something as simple as trimming the string value should not be this difficult, or involve touching this many files.
Slack developer channel:
@pixelzoom: JB: It appears that strings provided via the “strings” query parameter do not have directional formatting added. Is that expected? Is Rosetta adding the formatting?
@jbphet I just tested an RTL language in rosetta using the test feature, and I can see that the strings look right, but I'd have to spend some time on it to see if the embedding marks are in the query param and, if so, where and how they are added in the code.
Assigning to @jbphet to determine where directional formatting is coming from when ?strings
is used with Rosetta, and whether Rosetta also needs to trim
.
@jbphet in Slack, re the 'strings' query parameter:
AFAIK, it's only used by rosetta and devs, and is not otherwise a public query parameter.
To maintain momentum on this, I poked around in rosetta myself. After searching for various things ('directional', 'RTL',...) I finally found 'rtl' in a couple of comments -- which I've changed to 'RTL' in the above commit.
In translate-sim.js, it does look like directional formatting is being added on line 84:
// add RTL embedding markers for RTL strings
if ( rtl ) {
translation = '\u202b' + translation + '\u202c';
}
else {
translation = '\u202a' + translation + '\u202c';
}
Changes that need to be made in Rosetta:
(1) Replace the above code with:
translation = ChipperStringUtils.addDirectionalFormatting( translation, rtl );
... so that (a) build process and Rosetta are using the same code to add directional formatting, and (b) it's easier to locate directional formatting in the future (as was the problem in this issue).
(2) Immediately before the above code, trim whitespace by adding:
// remove leading/trailing whitespace, see chipper#619. Do this before adding directional formatting.
translation = translation.trim();
(3) Verify that whitespace is trimmed when English strings are displayed. If that's not happening, make it so.
(4) Verify that whitespace is trimmed when translated strings are saved. If that's not happening, make it so.
@jbphet Do you agree?
I don't feel comfortable making Rosetta changes, since I'm not familiar with how to test and deploy. Discussed with @ariel-phet and he said to assign to @jbphet to (1) make Rosetta changes and (2) verify the changes that I made to build tools in https://github.com/phetsims/chipper/commit/f7be23c15c5fa8943c5bc14fb6d3016a1e806bf7. Let me know if I can help.
@jbphet Check with @ariel-phet on priority. Rosetta will currently show leading/trailing whitespace, while requirejs/built sims will trim it. So what the translator sees doesn't match what will be deployed.
Okay, I'll set the priority and hopefully address during the upcoming chipper:2.0 effort.
@samreid and I ran into this in regards to https://github.com/phetsims/chipper/issues/779 and (more generally https://github.com/phetsims/rosetta/issues/193). I think it would be good to work on this as part of that work. Assigning to myself for some investigation so I don't forget.
Identifying leading/trailing whitespace is useful for the English (fallback) strings, which we can then fix.
+1 for an assertion for english strings!
I decided to do this work on the a11ystring-plugin
branch because it has a bunch of string plugin work already done. I was able to make the assertion for english strings in the below patch. I didn't commit it though because in friction (the a11y strings example I converted into the en json), I have some trailing spaces in the a11y descriptions. That will be fixed in https://github.com/phetsims/friction/issues/182. Until then I will keep this issue assigned to me. Even if in the future we can't apply this patch perfectly, it will be a good starting point.
@jbphet I see https://github.com/phetsims/rosetta/issues/163, so I don't think you need to be assigned here. Let me know if that changes.
https://github.com/phetsims/friction/issues/182 sorta became an issue to do this conversion in more repos than just friction (oops). And so now there don't seem to be any trailing/leading whitespace in a11y strings, and after adding the assertion in, I don't see any in the en json files either. I feel good about committing this to the branch. We will know for certain when it is merged to master. I'll keep myself assigned until then.
After merging to master, I ran aqua tests and found one error (https://github.com/phetsims/masses-and-springs/issues/350). I'm going to close this now that it is on master.
So I just ran back into this over in https://github.com/phetsims/friction/issues/237. Even though I added the assertion, it is never run, because we aren't running things with assertNoWhitespace
.
When I apply this patch, I see quite a bit of issue in our strings printed out when building just friction. I worry about what it will mean to actually limit our trailing/leading whitespace. It seems like we either need to tighten down on this, and potentially change a lot of translations, or discover another way forward. Maybe we don't care about this that much anymore?
Reopening to discuss at developer meeting.
We discussed this at today's dev meeting:
SR: Should we add assertions that the english strings don't have leading/trailing strings? SR: Should rosetta trim strings before storing them? SR: If there are no leading/trailing strings in the english, does that mean we can safely strip them out of the translations? JG: Perhaps one translator is just in a habit of adding trailing whitespace, and ....? JB: Does the whitespace actually change the bounds? JO: It does slightly change the text rendering, so it changes the layout. JB: Yes, but it doesn't seem like a big deal, but would be nice to fix when there's time.
We agreed the pattern in https://github.com/phetsims/friction/commit/b62a8d8d6b6d5deeea75b91292352af7b29fccbf is what not to do. You should have a template pattern that inserts space between sentences, not relying on sentences to have their own trailing whitespace.
SR: Is this a rosetta problem, or more about templating a11y strings? JB: Perhaps it is the latter. I recommend requesting the a11y team to use more maintainable patterns for templating around whitespace. JG: That pattern is not widely used, but it should be improved.
We decided this is at least 3 problems:
Let's touch base with @zepumph, who was not present for this discussion.
We discussed this more and decided two things.
For the actual assertion mentioned in https://github.com/phetsims/chipper/issues/619#issuecomment-925040123, I will test all sims and see how many cases there are with white space. I will note them in https://github.com/phetsims/rosetta/issues/270.
I will report back to a subgroup of @jbphet, and @jonathanolson.
Thanks!
@pixelzoom has pointed this out. Using a string value like "____Basics" for the title of a screen will place the string wildly out of place. (github DOES strip spaces, so I had to use underscores to show spaces 😠)
@ariel-phet please assign for someone to address.