Closed sveseli closed 2 years ago
Could you try changing
to
A.strm<<'\"'<<pvd::escape(scalar->getAs<std::string>())<<'\"';
It may be necessary to include pv/typeCast.h
.
Converting this code to use the yajl_gen API would probably be the simplest way to solve this, it is supposed to properly handle UTF-8 escapes.
If I try pvd::escape(), here is what happens when I run tests that I added (see https://github.com/sveseli/pvDataCPP/commit/37d18cdcc548973c95e3712c01129e00373ad7d2):
bluegill2> ./testApp/O.linux-x86_64/testjson | tail -16 ok 29 - round2 ({"scalar": 42,"ivec": [1,2,3],"svec": ["one","two"],"sub": {"x": {"y": 43}},"extra": 0,"sarr": [{"a": 5,"b": 6},{"a": 7,"b": 8},{"a": 9,"b": 10}],"any": "4.2","almost": "hello"}) == "{\"scalar\": 42," "\"ivec\": [1,2,3]," "\"svec\": [\"one\",\"two\"]," "\"sub\": {" "\"x\": {" "\"y\": 43" "}}," "\"extra\": 0," "\"sarr\": [{\"a\": 5,\"b\": 6}," "{\"a\": 7,\"b\": 8}," "{\"a\": 9,\"b\": 10}]," "\"any\": \"4.2\"," "\"almost\": \"hello\"" "}" ({"scalar": 42,"ivec": [1,2,3],"svec": ["one","two"],"sub": {"x": {"y": 43}},"extra": 0,"sarr": [{"a": 5,"b": 6},{"a": 7,"b": 8},{"a": 9,"b": 10}],"any": "4.2","almost": "hello"}) testPrintJson: result={"foo": "bar"}; expected={"foo": "bar"}; comparison=1 ok 30 - compare == true testPrintJson: result={"foo": "\"bar\""}; expected={"foo": "\"bar\""}; comparison=1 ok 31 - compare == true testPrintJson: result={"foo": "\"long string with several \" characters\""}; expected={"foo": "\"long string with several \" characters\""}; comparison=1 ok 32 - compare == true testPrintJson: result={"foo": "\"long string with several \" and \' characters\""}; expected={"foo": "\"long string with several \" and ' characters\""}; comparison=0 not ok 33 - compare == true Results ======= Tests: 33 Passed: 32 = 96.97% Failed: 1 = 3.03% Skipped: 2 = 6.06%
If I try loading to json the string that fails the test, I get this:
>>> pv = PvObject({'foo': STRING}, {'foo': '"long string with several " and \' characters"'}) >>> s = pv.toJSON(True) >>> s '{\n "foo": "\\"long string with several \\" and \\\' characters\\""\n}' >>> d = json.loads(s) Traceback (most recent call last): File "", line 1, in File "/local/sveseli/PVAPY/TEST3/CONDA-3.9/lib/python3.9/json/__init__.py", line 346, in loads return _default_decoder.decode(s) File "/local/sveseli/PVAPY/TEST3/CONDA-3.9/lib/python3.9/json/decoder.py", line 337, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/local/sveseli/PVAPY/TEST3/CONDA-3.9/lib/python3.9/json/decoder.py", line 353, in raw_decode obj, end = self.scan_once(s, idx) json.decoder.JSONDecodeError: Invalid \escape: line 2 column 44 (char 45) >>>
I wrote a simple function that just escapes a double quote (see https://github.com/sveseli/pvDataCPP/commit/b7e8833ee477a3f5a4ce59cae801b37f39144d9a), and which seems to work okay. If you'd like, I can make a pull request from my branch.
Converting this code to use the yajl_gen API ...
It doesn't look like this was/is present on 3.15, which has so far been supported.
I wrote a simple function that just escapes a double quote ...
Please test #79 which attempts to be a more complete solution.
It doesn't look like this was/is present on 3.15
yajl_gen.h
has been present in Base since before R3.15.0.1. If the header doesn't #define EPICS_YAJL_VERSION
though you're getting the older YAJL 1.0.12 API, which needs slightly different code to generate a yajl_gen handle
, and didn't support long long int
values.
On inspection your PR #79 looks like it should handle 7-bit ASCII characters properly (which is an improvement), but may generate mojibake for other characters depending on what isprint()
returns (locale-dependent). Assuming the tests eventually pass (GHA seems to be having issues with this job) I'd be okay with our merging this for now.
yajl_gen.h
has been present in Base since before R3.15.0.1. ...
My mistake. I replaced #79 with a quick attempt to use this, which will likely fail the 3.15 build. It also causes testjson
to throw up a valgrind warning. I can't immediately see what is going on, but assume I mustn't be using the yajl api correctly...
Conditional jump or move depends on uninitialised value(s)
at 0x483BC85: strlen (vg_replace_strmem.c:459)
by 0x4A35D77: yajl_gen_string (yajl_gen.c:272)
by 0x496596C: (anonymous namespace)::yg_string(yajl_gen_t*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (print.cpp:75)
by 0x4965A7B: (anonymous namespace)::show_struct((anonymous namespace)::args&, epics::pvData::PVStructure const*, epics::pvData::BitSet const*) (print.cpp:93)
by 0x4965FFA: (anonymous namespace)::show_field((anonymous namespace)::args&, epics::pvData::PVField const*, epics::pvData::BitSet const*) (print.cpp:163)
by 0x49667FE: epics::pvData::printJSON(std::ostream&, epics::pvData::PVField const&, epics::pvData::JSONPrintOptions const&) (print.cpp:270)
by 0x10EA73: printJSON (json.h:75)
by 0x10EA73: (anonymous namespace)::testroundtrip() (testjson.cpp:230)
by 0x10EEF6: main (testjson.cpp:285)
... but may generate mojibake ...
There is no "may". My first take would do so.
... I'd be okay with our merging this for now.
I'll fall back to the previous version if the yajl_gen usage issues take too long to sort out.
... I mustn't be using the yajl api correctly...
yajl_gen_indent_string
doesn't copy the provided char*
.
@sveseli Please re-test with #79
In retrospect, using yajl_gen.h wasn't worthwhile. If I had looked closer I would have noticed that it only emits single character escapes for values < 32
(aka space). I think this makes sense as the output is allowed to be utf-8, so there is no reason to escape anything other than ascii control characters. This would have been only a one-line change to my original (somewhat smaller) PR. Oh, well. I'm not motivated enough to revisit this now. I guess this serves as a reminder that I should always look more closely behind a new (to me) API before using it.
The printJSON() method results in invalid JSON if strings contain double quote characters (see https://github.com/epics-base/pvaPy/issues/74).