nlohmann / json

JSON for Modern C++
https://json.nlohmann.me
MIT License

The library cannot parse JSON generated by Chrome DevTools Protocol #3903

Closed goodguysoft closed 1 year ago

goodguysoft commented 1 year ago

Description

The library throws an exception when I try to parse a JSON message produced by the Chrome browser via the Chrome DevTools Protocol. I do not know whether Chrome violates any standard here, but Chrome is a de facto standard, so at least a flag that makes the library compatible with Chrome's output would be a good idea. I can also view the JSON in Visual Studio, Notepad++, and some other tools, so the format is acceptable to other widely used tools.

Reproduction steps

I have attached a code sample (a Visual Studio 2022 native unit test). It simply calls json::parse on a JSON file generated by the Chrome browser.

Expected vs. actual results

Since Chrome, Visual Studio, and Notepad++ can all parse the JSON, I expect nlohmann::json to provide a way to do so as well. The file containing the JSON in question can be downloaded here: data.json

Minimal code example

#include "pch.h"
#include "CppUnitTest.h"

#include <format>
#include <fstream>
#include <string>

#include <nlohmann/json.hpp>

using namespace Microsoft::VisualStudio::CppUnitTestFramework;

namespace JsonTest
{
    TEST_CLASS(JsonTest)
    {
    public:

        TEST_METHOD(Nlohmann)
        {
            using namespace nlohmann;
            using namespace std;
            try
            {
                fstream json_reader(R"__(c:\data.json)__");
                json json_data = json::parse(json_reader);
            }
            catch (const exception& error)
            {
                Logger::WriteMessage(format("Can not parse JSON file. {}", error.what()).c_str());
            }
        }

    };
}

Error messages

Can not parse JSON file. [json.exception.parse_error.101] parse error at line 1, column 7198: syntax error while parsing value - invalid string: surrogate U+D800..U+DBFF must be followed by U+DC00..U+DFFF; last read: '"\u001f\b\u0000\u0000\u0000\u0000\u0000\u0000\u0003s~EJvoCJ\n$A\u0011I)Y~2Ye4\u017cs';;O;yHT\u06f4m\u00f2\r{tk\r\u000ea97 8]\u0013\u000e!<32p]9\u0004Yw\u000eZN'w&FkwQL\u0005fx#k4\u00188=:\u001fvXc'-x,\b\\sl\u0006\r\u00181L\u000f${\u00159K\u0005ND\"QS\u0001$\u0007^\u428d-\"8,HBz\u0001\u001ai\u001f&\u001ag/xE\ucaae\u0006 9\"9X*-\u03d0x~MvLgy\u0225V-zz-d~2cI\u00a3\u001fiv~dDD\u0011~\u00073\u0539da1\"\u06c6\u000eT\u0019N3?c\u000b9-\u0006\u001b\u000f}r;0\u0016\u0329\u0007a16G`\u001c\fLn\u0006\u001b#\u00160gFccn\u001a~\u000f,%p]{i\u001f5-hK\u069c\u04b1\rGm'N\u0014\u0001\u0011i=6\u0002ao\u00028)<\u000ft\u0013sJ32g\u04f0#\u59dd\u03d4\u001dMFwyl|&2\u001a`A\u0006\u0010=\u0341\u0003Xp;>Fg\u00d1e;dd\u001fgp\rO@&L yw\u0007\u0002/\u053dtyr9\n[$:1r`\u0307\u001e\f\u0546<tg+\u0019\u0002h4e!,Y+e\u0002B~\u001d7L@P\u001dnGI\u001b\u0001j^\u001e\\\u000b~\u0003#\u0007op\rfr\u05a8gI0w\u0019J\u001e\u0751/61kR$4pesK}\u0016IH\"\u030d(\u04c8I}S?\u02dd:\u0003\\\"\n\u0015l\u0732PW \u001351, ?\u000b3_,_@\u001c\u0010r\u00109(eK\u0013\u0002)8K\u000fI]a\u0006|1?sjU%d+meYFQu\u02fb\"!5\u000fv\u000e.\u001aN\u05c4\u0012\u0010\u0014Rx\u03627`m\u0010=\u001f!zr}Y\u0004>-z\u00039O.\u0335|\u001f{K?=Z@\\N\nf54&D}0co1f\u00160b M\u0013v`,%!X_\u000bB=<\u001aB\u0014\f/]3\u0010e/\u0002\u001eda\u0011D\u0011E5\u0233D\u001e[$\u001aH$:\u001bF#\b\u0014\rB\u03c5\",A\u001e%T\u0002\u0004C\u000e}84sqB5\u000e\u01e9\u0013\u0012y\u001e[$HdA$C\u0003Y\u01e2\b\u0016?7\fY\n8gB\u001eZ\"jg=W{M`\"\u03f1;O\"aQ$\nr\u001f^|\u0006g_m8\"\u001bQ`5_> 
y\u000eI\u000b\u001e>o\u0004\u0002)J<isA1sRBL#\u001esKM'Su\u0010\u001a3q\u001f+\u012ex\u001eOx\u0016<\u001eA\u0166@\\$Qp\u001anT\u0001?F\u001cL?<FCnW+&WR~n\u0004\b)\u0011y\u0018ae\u0018LkCPB\u0013bL\u0017CB9\u0003?,1<Af\u001b^K&\u0002\u0018aL3\u0393G0q?<\u0000\u0006N\u0011/B\fe\u02db\"YX\u000fn\u000f\u0000\u00135cTt\u0018\u0018<$\u000fJ'\u0002w\u0267\u0120?j7\u0006T\u001d\u0007s\u001c\u001c\u0004Q\u0016b\u00060\b_:=--\u0019Y\u0018\u0000W\u0015C\u06547D\u0012zYsH\u0014L\u0011~^\u0015\u001etF<\b33\u0011mhT\u020d\u0000> a\u000e\u001di|2[ \u0013\u001a\u0011i`#\u000b1:\u0000\f$\u05bdxsirs\u0005W9\fL*\u07d82o\u001bb9\fh&r`\u28c8e\u0012+\f \u078b\u001a\"YA{XXJ\rDItSX2\u000by\u0006\u0014\u0012l\u0015\u00142@`p]4CE3\u0012  Dwq7o33\u02c4/[TYy\u0016n\u0014J\u0018#Bg4\u01fa.s\u0016S\bKaZ0\u0010y)h\bxh\b,\u0018$=\u0000\"yR\u0016\u000f\"\b`qlZ:Q\r\u0016\u000f)\u00042Y&'b\u001a\u0002Z\t2\u0012\u0214\u0000\u0011+F\u0000`6\\f,\u0006DQ@R\u00195_24|M#3#\u0003VK\u030aays\u00018\u0002uw!\u04cf[B8),MKL\u000bE\u0006CQ&\u001f\u0552\u001c\u0014\u001f6Rz\u0004K3J3TBz\u0015\u001a6\u0018\tVt\u001eOWd\\R71!\"X\\N^\u0014O\u000f\u0019\"\u0003\u0013syCq0<o\u0002\u0013\u0003X4(Y\u000e\u0013J\u0002\u0011nHA\u03a2\"N\u000f\u0013X>x&\u000bVe\u0014X\u000fY\u0002:< `fr\n\u0017v\u0000WQ\u001b?\u0001b\u00109E\u0000\u0006^\\7\b~+cj3g\u0014Rq\b~X7\u00163XlV~\u0003<\u000e\"CV7tSX\u0013\u0017\u0018'\u001c\fR\u0004sj\u0004\u001ea\u001aeF?\u0011zmKq0+^Z8\u000b]\u0017bNDVN?\u0019H\u0014s\u0005\u001bE\bbq#3\u00069\u0411/<\bb\u0017=R\u0006i.Ae$\u0003K,\u0005\u000f6\u001a,SnW\u001cpi\u0007{+\\#\u0016!_\u001a,|\u0011(tw\u0011wA\u0006qNEE9+\u071aPDE\u0756\u0720\rb\u0018pjC\u0019[Ck`)b=Ig\u0016 
\u0152g\u0000_\u0014:\u0018,Y`T0z\u0398[qd\u000e\u0006NB%\u000f\u0000X\u0012,\u00ed@'I,\t3\bQ\u0013~\u001aJ\u0016X!7t\u0012%\u001am*S\u001fj\f\u0674N3Qiij\u0015\"\"\r\u0011q\u001c\u001cm3\b\u0006*\u0012J:B+f\u0010V\u0007a\u0005HS_.!\u0011mbH![\u001b\u0142gi955UE*\\%baq3t~!c,t5Ex2hV$\u0014S*`\u001bwr\b\u02e0Y\u0010\u0005\u0018\u0015sV_k^<\u00166M\u0002^cyK\u000f\u0017\u001emg\u0004\u0016F~\na\u06cbP\u00152<}\u0012x\tx\u015f\tX9\u000fS\u001d<5X<t!LD\u000e{\u0004:4A\u0293\u001bE1\u01dc5K.{=\u0000,X|pdXh\bvY\u0006S(@p\u0002\u02f6\u0001\u00032\u0016eHX\u0016E\u0440!\u0540+i9_50I\u0012&\u0088D\u0010#\u017c-1_rJ\u0018\u0018Q\u0019uSn19O0\bB:\f20+o\u0018qO\b.H'\u0011a\u0014\\\u0007Z\u0002#\t\u0735\u0007md\u0006M,%\u0011`R\u000f\u0003HHD4\u07390\u001dXc,\u01a9&$*D8$1X\\\u05f0\"@\u0016h4a3\u00d0~\u0001G\b\u0004eWJ~IXPLG+\u0001L[!\fh\u0000\u6ee2R\u0014\u0011\u0000\u607f{`\u026aYHN\u0005E\u0002s\n3RJ\u0014Ad+\u000e\u0011\u0017l\u0011`gIl7\u0015\u00129c\u0018ha\u062er9701\u0002O\u001dE\r@l1*me\u001c6\u0006I4B`\u0004Q)F\u001a\u0003\u00199\u0019\bFY\u000bF\u03ee\u0014G\u0001@\"\u0004k\u0000BG\u001d\u0004+S&T$1\u0014u\u0015P,\f9A\u000eQ\u0001\u0007~&V\r\"Z\u0004\u0015J4C]V \u000e&S\u0739\u0631-u!mM5O\n\u001bZE!\u031eW\u0241UD:` J\u000b\t8\"?%rM\u0004\u0002\u0001C<\u0005\u001c\u0198eA\u001d08\u0010\u0011\u0000\u000bJ8 \u001a*TRj;Wv\bD\u001a`\u04e9[`\u0013`?R\u7498&\nVVGs;sC\u00043\u001a\u0012\u0017,b\u0018Qn,eJ+=i<6v%{!^o\u0620F\t[J.\u0013~P\u001f\u000bJ,\u0002\u05f2]h\u000brkWnu\u0000\u001dF\u07c4\u001bjkTd\u0005lzsj\u000b\nW<D(\nDM\u0018cx\u000f\u0003-+tgp\u0001Y\u001eD).\u0772v/9UT\u0016|\u0013'\u0011.CA\u0017eb0#4\u0015\u001d\\\u0015ki\u042eMSa<j\u0002b)7Kg\u0001\u03da\u0016n.\u001e njq]>KXS=mg\bjs\u0616\u0014sj7\u0005?h\u4008]\u0015^\u0017@\"\u000eJ<p_gh\u04ba\u042d\u06b1!>Vr\u000bh\u5f0f\\o\"+g\u001f)!\u001a\u0018\u0015Qm\u06f4\rp;\u000fmO\u0269 p h\u00027V\feU\u0016\u00012\u0615lve+\u0010\u0018nVX\u0016\u04e36m}|'1J6\u03a9A 
1J\u0165\u0016k*H8C\u0005*\u0003\u0012[Sf0}GuJpq\u00efQRI\u001cvT\u0330Y\u0006\u072a`\fZ5\u000eLVNY\u001eGj\u0007\u0018t\u0000Ll1;F[xam9@\u0014*i]YW\u0019L6\u0018t\u0003JAO:.\f}*Um6/\u0003Yn\u0003\\@<^\u0013;YJ\u0014lR8+=\u0011<6b\u0018Ta$:\f%MAE\u0469`.\"5\nw~$-\u001ej<Ss\nr\u0727\u0004%{y.i\fzG.Y\u0013m*\u00044euj\b7S'1\fUA\u0189'8p\u0002\u0018]:P\u19a8_z'ks\u000eX\u001f\u001a\u001eG\u06b85)\u032dnMtJgYm&R>`K\u01673Bm%\u0016\u0016\u001dLbc;\u0010nY-\u001a!\u0003\u0017x96q@\u04e8vd\nW\u0011$.:\u0006a\u06a5w\u5c3c[SljK}``X4#\u0006C,\u0002 g\u0272<\"4\u0690\nI6~#\u0010\u0010\u0541\u0004pVP;j\u00154=O\uf8ca\f\u0019SmbKc\u0000(\u079d\u000b|\b\u0001!NE\"\u00158[:C\u000b\u0000H\t\u056fqkp<r<u3MzS\u0466XjD\u0017[YG\u001f\u0010\u05de\\p3\\G9\u001d~HVwx8eYB]jhH/ \r`=IL<K\uc9b9\t\u416d/W[\u001cy\bZ*]d(`\u0013X\u0001YVDy\\\u0000\u0016\\5q7U\u0002^N\u0019z\u0016\u0000\u0005@\rDA2LSE\u0007\b\u0018\r\u0001P4\"7@L!l&\u0017i\"Q\u0012P`qT\u0000C`=9SlH\u00150m.:y.)(\u000fBm\ud81d\"'

Compiler and operating system

Visual Studio 2022, Windows 11 Pro.

Library version

3.11.2

nlohmann commented 1 year ago

The JSON structure is correct, but the payload in postData is invalid Unicode (hence the error message). The JSON specification (RFC 8259) assumes correct Unicode to ensure interoperability. The specification allows implementations to reject inputs otherwise. The library currently has no switch to ignore such an error.

nlohmann commented 1 year ago

(Are all your inputs ill-formed like this, or is this an exception?)

goodguysoft commented 1 year ago

postData is the actual POST data created by the website. It may be binary, and for some reason the Chrome DevTools Protocol encodes it this way. Base64 might have been a better choice, but I suppose Google decided to just escape the binary data. So invalid data may appear from time to time, depending on the website I am debugging with Chrome DevTools. The only fix I see for now is to filter out the postData value and parse it manually later. Perhaps an "incorrect string" callback would be a good idea: it would receive the invalid string (the postData value in this example) and let custom code parse it. It would be similar to the current parser_callback_t callback, which, as far as I can see, does not allow handling invalid data.
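
One possible pre-processing workaround, sketched under assumptions: replace every unpaired surrogate escape in the raw message with \ufffd (the Unicode replacement character) before handing the text to json::parse. This is lossy, so the original binary postData cannot be recovered from the result. The helper names are hypothetical and not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: value of the "\uXXXX" escape starting at `pos`,
// or -1 if no well-formed escape starts there.
int escape_value_at(std::string_view raw, std::size_t pos) {
    if (pos + 6 > raw.size() || raw[pos] != '\\' || raw[pos + 1] != 'u') return -1;
    int value = 0;
    for (std::size_t i = pos + 2; i < pos + 6; ++i) {
        const char c = raw[i];
        int digit = 0;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return -1;
        value = value * 16 + digit;
    }
    assert(value >= 0 && value <= 0xFFFF);
    return value;
}

// Hypothetical helper: copies `raw`, replacing each surrogate escape that is
// not part of a valid high/low pair with "\ufffd". Everything else is copied
// unchanged, including valid pairs and other escapes.
std::string replace_unpaired_surrogates(std::string_view raw) {
    static constexpr std::string_view replacement = "\\ufffd";
    std::string out;
    out.reserve(raw.size());
    std::size_t i = 0;
    while (i < raw.size()) {
        if (raw[i] != '\\') { out += raw[i++]; continue; }
        const int unit = escape_value_at(raw, i);
        if (unit < 0) {
            // Some other escape (\n, \", \\, ...): copy it verbatim.
            out.append(raw.substr(i, 2));
            i += 2;
        } else if (unit >= 0xD800 && unit <= 0xDBFF) {
            const int next = escape_value_at(raw, i + 6);
            if (next >= 0xDC00 && next <= 0xDFFF) {
                out.append(raw.substr(i, 12));  // valid pair, keep both escapes
                i += 12;
            } else {
                out.append(replacement);        // lone high surrogate
                i += 6;
            }
        } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
            out.append(replacement);            // lone low surrogate
            i += 6;
        } else {
            out.append(raw.substr(i, 6));       // ordinary \uXXXX escape
            i += 6;
        }
    }
    return out;
}
```

The sanitized string could then be passed to json::parse in place of the original text. Whether losing the exact postData bytes is acceptable depends on what the debugging tool needs from that field.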