Closed GoogleCodeExporter closed 8 years ago
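This patch adds proper bounds checking for surrogate pairs: two consecutive \uXXXX
escapes are combined only when the first falls in the high-surrogate range D800-DBFF,
as opposed to combining any two escaped characters that appear together.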
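Test with (the first string is a valid surrogate pair, the second is two ordinary
BMP escapes that must not be combined):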
1> c(mochijson2).
2> mochijson2:decode("{\"foo\":\"\\ud834\\udd1e\"}").
3> mochijson2:decode("{\"foo\":\"\\u0023\\u0101\"}").
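For readers who want to see the range check spelled out, here is a minimal sketch of how a pair could be validated and combined. It is not the code from the attached patch; the module name surrogate_sketch and the function combine/2 are hypothetical, and the low-surrogate check (DC00-DFFF) follows the UTF-16 definition rather than anything stated above.

```erlang
-module(surrogate_sketch).
-export([combine/2]).

%% Hypothetical helper for illustration only, not part of mochijson2.
%% Combine a UTF-16 high/low surrogate pair into a single code point.
%% The guard accepts only Hi in D800-DBFF and Lo in DC00-DFFF.
combine(Hi, Lo) when Hi >= 16#D800, Hi =< 16#DBFF,
                     Lo >= 16#DC00, Lo =< 16#DFFF ->
    {ok, 16#10000 + ((Hi - 16#D800) bsl 10) + (Lo - 16#DC00)};
%% Any other combination is not a surrogate pair, so the caller
%% should keep treating the two escapes as separate characters.
combine(_Hi, _Lo) ->
    not_a_pair.
```

With the two test strings above, surrogate_sketch:combine(16#D834, 16#DD1E) gives {ok, 16#1D11E} (the single code point U+1D11E), while surrogate_sketch:combine(16#0023, 16#0101) gives not_a_pair.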
Original comment by metrofindings@gmail.com
on 5 May 2009 at 9:40
I should have checked the issues list before embarking on fixing this myself!
I ended up with a shorter patch that relies on xmerl_ucs to calculate the code point
for the surrogate pair, but is otherwise similar.
This issue prevents CouchDB from replicating documents that contain Unicode characters
outside the BMP, because encode() escapes them as surrogate pairs, but decode() can't
handle that format.
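To illustrate the encode() side of that round trip, here is a rough sketch of how a code point above the BMP is split into a surrogate pair. The module name surrogate_encode_sketch and the function to_pair/1 are hypothetical; this shows the standard UTF-16 arithmetic, not mochijson2's actual implementation.

```erlang
-module(surrogate_encode_sketch).
-export([to_pair/1]).

%% Hypothetical helper for illustration only, not part of mochijson2.
%% Split a code point in 10000-10FFFF into its UTF-16 surrogate pair.
to_pair(CodePoint) when CodePoint >= 16#10000, CodePoint =< 16#10FFFF ->
    Offset = CodePoint - 16#10000,
    %% The top 10 bits of the offset select the high surrogate,
    %% the bottom 10 bits select the low surrogate.
    Hi = 16#D800 + (Offset bsr 10),
    Lo = 16#DC00 + (Offset band 16#3FF),
    {Hi, Lo}.
```

For example, surrogate_encode_sketch:to_pair(16#1D11E) gives {16#D834, 16#DD1E}, which is the \ud834\udd1e escape sequence a decoder then has to recombine.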
Original comment by adam.kocoloski@gmail.com
on 5 Jun 2009 at 3:08
Applied in r108.
Original comment by bob.ippo...@gmail.com
on 28 Sep 2009 at 7:12
Original issue reported on code.google.com by metrofindings@gmail.com
on 5 May 2009 at 4:07