Closed nfm closed 11 years ago
My workaround for now is to add
foo.to_json.gsub(/\u2028/, '\u2028').gsub(/\u2029/, '\u2029')
This sounds like a bug in browsers, not in Rails...
According to this Stack Overflow answer, it's more like a JSON spec bug - JSON allows characters in string literals that are invalid in Javascript code.
The unicode characters in question are equivalent to something like this, which causes the same syntax error:
var foo = "this string
is multiline";
Obviously you need to use an escape sequence (\n
, \u000A
, etc) for the above to work.
I guess \u2028
and \u2029
are equivalent to the literal line break, which is why they're not valid in Javascript.
In any case, if you're taking user input, storing it, and then returning it as JSON, Javascript execution will break when it hits either of these characters, so it would be preferable not to output them!
I can roll my own fix for this, but this feels like it belongs in Rails, in the same way as other cases that would result in invalid output (eg "
gets escaped) are handled in ActiveSupport::JSON::Encoding
.
Do you think \u2028 and \u2029 fall under the same category?
@vbrendel tr
should be faster than chaining gsub
:)
@steveklabnik this is a regression. it works in rails 3.2.x. I upgraded one of my apps to rails 4 rc1 today and was bitten by this pretty severely.
@solnic I was affected by this issue on 3.2.13.
Thanks for the fix guys!
@nfm could you check 3-2-stable?
@rafaelfranca Yes, I can confirm that 3.2.13 is affected by this issue, and that it's fixed on 3-2-stable (see https://github.com/nfm/rails-json-u2028-test for a minimal test app).
I notice that e4ec944 was on master - was there another commit on 3-2-stable to fix this that I've missed?
@nfm thank you for confirm and for the test app.
Yes. There is https://github.com/rails/rails/commit/c910388587220e962682b0b9187e79b8f1641c17 that I made only to 3-2-stable
If a string contains either the unicode character \u2028 (line separator) or \u2029 (paragraph separator), calling
as_json
orto_json
on the string will output JSON that cannot be parsed as Javascript by the brower.Firefox will raise
SyntaxError: unterminated string literal
. Chrome will raiseSyntaxError: Unexpected token ILLEGAL
. I haven't tested Safari/IE/Opera.Apparently, these characters are valid in string literals in JSON and will be accepted by
JSON.parse
, but are treated as line endings by Javascript parsers.Here's a quick demo:
The following HTML will be output:
Firefox and Chrome will raise syntax errors.
Here's one workaround I've discovered:
escape_javascript
will encode \u2028 as '
'. It doesn't handle \u2029.Another workaround is to call
gsub
ortr
afterto_json
.I'm not sure what the best approach is to tackling this. Would it be appropriate to submit a patch that modified
String#as_json
to convert these characters to a\n
?I'm not confident on what the implications of this kind of change would be. There may well be other unicode characters that cause the same issue, too.
Related issues: