computerline1z / okapi

Automatically exported from code.google.com/p/okapi
0 stars 0 forks source link

JSONFilter should use the JSONEncoder #373

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The JSONFilter class uses the DefaultEncoder not the JSONEncoder.
The JSONEncoder seems to be a copy of the PropertiesEncoder currently.
We need to add encoding tests and update the JSON filter to do the proper 
escaping.

Original issue reported on code.google.com by yves.sav...@gmail.com on 23 Oct 2013 at 12:25

GoogleCodeExporter commented 9 years ago
The JSON Filter currently extract the string as-it: it doesn't try to un-escape 
any escape sequence. And conversely it doesn't try to escape any special 
character when merging since they are supposed to be already escaped. Hence the 
use the DefaultEncoder.

Original comment by yves.sav...@gmail.com on 23 Oct 2013 at 3:23

GoogleCodeExporter commented 9 years ago
Correction: JSONFilter internals are parsing the string and un-escaping 
properly, but, because we don't escape on merge, we re-escape in the extracted 
string (except for \uHHHH).

Original comment by yves.sav...@gmail.com on 23 Oct 2013 at 3:38

GoogleCodeExporter commented 9 years ago
I'm tackling this as part of issue 377, and actually you were right the first 
time--JSONFilter was recognizing \n, \r, \", etc. but then actually keeping 
them escaped.

For use with subfilters it is desirable to unescape on parsing, and then 
re-escape on merging, via the JSONEncoder. I have made these changes on dev.

Original comment by aa...@madlon-kay.com on 28 Nov 2013 at 1:04

GoogleCodeExporter commented 9 years ago
Yves, take a look at Aaron's changes when you have a chance - I think we can 
probably close this.

Original comment by tingley on 29 Nov 2013 at 10:21

GoogleCodeExporter commented 9 years ago
Looks OK.

We just need to make sure users know the change breaks backward compatibility 
with text previously extracted. For example we can't merge with the new version 
a file extracted with the old, we would get double escaping.
A note in the Changes log would be enough probably.

Original comment by yves.sav...@gmail.com on 2 Dec 2013 at 3:39

GoogleCodeExporter commented 9 years ago
The compatibility issue is now noted in changes.html.

Original comment by aa...@madlon-kay.com on 2 Dec 2013 at 11:16

GoogleCodeExporter commented 9 years ago
Just a follow up note:
Sergei rightly reported a side effect in XLIFF, caused by the change: Some 
characters are now invalid in XLIFF. For example JSON \b is not valid as a 
literal character in XML. And there is no official ways to deal with invalid 
XML characters in XLIFF 1.2.

But this is an XML/XLIFF issue.

Original comment by yves.sav...@gmail.com on 11 Feb 2014 at 6:23

GoogleCodeExporter commented 9 years ago
Re-opening this: Need to test the filter sets the TU to preserve the spaces if 
needed since now sequences like \n\n\n are not escaped any more.

Original comment by yves.sav...@gmail.com on 16 Feb 2014 at 4:18

GoogleCodeExporter commented 9 years ago
Closing this issue: Jim's refactoring has now the entries set to 
xml:space='preserve'

Original comment by yves.sav...@gmail.com on 1 Apr 2014 at 2:55