JuliaIO / JSON.jl

JSON parsing and printing

Parsing a dubious JavaScript/JSON file #292

Open cormullion opened 5 years ago

cormullion commented 5 years ago

I was deep in a rabbit hole trying to parse a JavaScript file that defined a JSON object. I found an easy way to reproduce it on a Mac:

using LazyJSON, JSON
s = read("/Applications/Julia-1.1.app/Contents/Resources/julia/share/doc/julia/html/en/search_index.js", String)
s = replace(s, "var documenterSearchIndex =" => "" )
jsoncontents1 = LazyJSON.parse(s)
jsoncontents2 = JSON.parse(s)

JSON.jl complains about a backslash-escaped single quote (\'):

ERROR: LoadError: Invalid escape sequence
Line: 15
Around: ...se Julia\'s compiler is differ...

I thought at first that this may well be the correct thing to do. But when I tried LazyJSON, it parsed without errors, which made me think that it might be OK. I spent a few minutes searching the internet but couldn't work out whether a \' was OK or not...

So I'll let you decide if this is a bug! :)

kmsquire commented 5 years ago

Strictly speaking, this isn't valid JSON--an escape (\) can only be followed by one of a specific set of characters: ", \, /, b, f, n, r, t, or u plus four hex digits (see http://json.org/). A single quote isn't in that set.
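To see where a file steps outside that set, something like the following could be used. It is only a rough sketch: `invalid_escapes` is a hypothetical helper, not part of JSON.jl, and it scans the whole text rather than tracking string boundaries (which should be fine here, since a backslash is only legal inside a JSON string anyway). It also doesn't check that \u is followed by four hex digits.

```julia
# Hypothetical helper (not part of JSON.jl): return the character index of
# every backslash that starts an escape sequence JSON does not allow.
function invalid_escapes(s::AbstractString)
    allowed = "\"\\/bfnrtu"      # characters permitted after a backslash
    chars = collect(s)
    bad = Int[]
    i = 1
    while i <= length(chars)
        if chars[i] == '\\'
            # A trailing backslash, or one followed by anything outside
            # `allowed`, is recorded as invalid.
            if i == length(chars) || !(chars[i + 1] in allowed)
                push!(bad, i)
            end
            i += 2               # skip the escaped character, so \\ counts as one pair
        else
            i += 1
        end
    end
    return bad
end
```

Calling `invalid_escapes(s)` on the contents of the search index should then point at the \' that JSON.jl rejects, assuming that is the only problem in the file.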

Whether or not we want to support parsing this anyway is another question. I would be open to it, perhaps with an additional keyword argument (e.g., strict=false).

(You also might want to check out JSON3.jl)

cormullion commented 5 years ago

Thanks! A less strict option is always nice for the users, particularly when working on the fringes of the standards (talk to anyone working with Markdown...!).

I get a similar-ish problem with JSON3.jl (albeit with a different file):

 JSON3.read(filecontents)
ERROR: ArgumentError: invalid JSON at byte position 631241 while parsing type Any: InvalidChar
ntegrator.qold\nend"
},

]}

so I'll stay with LazyJSON for working with my malformed stuff, for now.
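For anyone who needs to stay on JSON.jl in the meantime, one possible workaround is to clean up the text before parsing. The sketch below is an assumption about what would be enough for the \' case only; `unescape_single_quotes` is a made-up name, not something JSON.jl provides. It walks the text pair by pair instead of doing a plain replace, so an escaped backslash followed by a quote (\\') is left alone.

```julia
# Hypothetical preprocessing step (not part of JSON.jl): rewrite \' as '
# so that a strict JSON parser accepts the text.
function unescape_single_quotes(s::AbstractString)
    out = IOBuffer()
    chars = collect(s)
    i = 1
    while i <= length(chars)
        c = chars[i]
        if c == '\\' && i < length(chars)
            nxt = chars[i + 1]
            if nxt == '\''
                print(out, '\'')     # drop the backslash, keep the quote
            else
                print(out, c, nxt)   # copy every other escape pair unchanged
            end
            i += 2
        else
            print(out, c)
            i += 1
        end
    end
    return String(take!(out))
end
```

With the `s` from the first comment, `JSON.parse(unescape_single_quotes(s))` might then get past line 15, though I haven't checked whether the file has other non-JSON constructs further on.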