michalmuskala / jason

A blazing fast JSON parser and generator in pure Elixir.

Issue decoding particular string #160

Closed Hajto closed 11 months ago

Hajto commented 1 year ago

Elixir: 1.13.4 (compiled with Erlang/OTP 24)
Jason: 1.4.0

String:

{\n  \"name\": \"Elegant Ocean Racers #4970\",\n  \"description\": \"Elegant Ocean Racers are a collection of 5005 elephant & sea creature combinations, inhabiting the Polygon blockchain. Your Elegant Ocean Racer grants you access to the Elegant Ocean Race League.\",\n  \"image\": \"ipfs://QmRhriyxYQVmm4pUFp3WaFwNn99hz15x5SoLWVR5aXUDee/4970.png\",\n  \"dna\": \"77630fb67bbee3bcbe54fdb4d8e0cc3a155fada9\",\n  \"edition\": 4970,\n  \"date\": 1653213448373,\n  \"artist\": \"Tosh\",\n  \"attributes\": [\n    {\n      \"trait_type\": \"Background\",\n      \"value\": \"Gold\"\n    },\n    {\n      \"trait_type\": \"Creature\",\n      \"value\": \"EOR Phantom\"\n    },\n    {\n      \"trait_type\": \"Phant Skin\",\n      \"value\": \"Gold\"\n    },\n    {\n      \"trait_type\": \"Eye Accessory\",\n      \"value\": \"Sun Shades (Gold)\"\n    },\n    {\n      \"trait_type\": \"Headwear\",\n      \"value\": \"Bowler\"\n    },\n    {\n      \"trait_type\": \"Mouth Accessory\",\n      \"value\": \"Smile\"\n    },\n    {\n      \"trait_type\": \"Tusk\",\n      \"value\": \"Gold\"\n    },\n    {\n      \"trait_type\": \"Accessory 2\",\n      \"value\": \"None\"\n    },\n    {\n      \"trait_type\": \"Accessory 1\",\n      \"value\": \"Anchor (Gold)\"\n    },\n    {\n      \"trait_type\": \"Tier\",\n      \"value\": \"Legendary\"\n    }\n  ],\n  \"compiler\": \"HashLips Art Engine\"\n}"

It throws an error:

%Jason.DecodeError{data: "{\n  \"name\": \"Elegant Ocean Racers #4970\",\n  \"description\": \"Elegant Ocean Racers are a collection of 5005 elephant & sea creature combinations, inhabiting the Polygon blockchain. Your Elegant Ocean Racer grants you access to the Elegant Ocean Race League.\",\n  \"image\": \"ipfs://QmRhriyxYQVmm4pUFp3WaFwNn99hz15x5SoLWVR5aXUDee/4970.png\",\n  \"dna\": \"77630fb67bbee3bcbe54fdb4d8e0cc3a155fada9\",\n  \"edition\": 4970,\n  \"date\": 1653213448373,\n  \"artist\": \"Tosh\",\n  \"attributes\": [\n    {\n      \"trait_type\": \"Background\",\n      \"value\": \"Gold\"\n    },\n    {\n      \"trait_type\": \"Creature\",\n      \"value\": \"EOR Phantom\"\n    },\n    {\n      \"trait_type\": \"Phant Skin\",\n      \"value\": \"Gold\"\n    },\n    {\n      \"trait_type\": \"Eye Accessory\",\n      \"value\": \"Sun Shades (Gold)\"\n    },\n    {\n      \"trait_type\": \"Headwear\",\n      \"value\": \"Bowler\"\n    },\n    {\n      \"trait_type\": \"Mouth Accessory\",\n      \"value\": \"Smile\"\n    },\n    {\n      \"trait_type\": \"Tusk\",\n      \"value\": \"Gold\"\n    },\n    {\n      \"trait_type\": \"Accessory 2\",\n      \"value\": \"None\"\n    },\n    {\n      \"trait_type\": \"Accessory 1\",\n      \"value\": \"Anchor (Gold)\"\n    },\n    {\n      \"trait_type\": \"Tier\",\n      \"value\": \"Legendary\"\n    }\n  ],\n  \"compiler\": \"HashLips Art Engine\"\n}", position: 476, token: nil}

Weird thing is that if I copy and paste the string from the error it actually decodes...

https://user-images.githubusercontent.com/8336429/208484737-6f2bcf49-1254-4152-8ffa-6570ae712583.mov

Hajto commented 1 year ago

Passing it through String.normalize(&1, :nfkd) helps.
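
A likely reason this helps: under NFKD, the compatibility decomposition of U+00A0 (no-break space) is a plain U+0020 space. The sketch below uses a short made-up input rather than the string from this issue, and shows that the normalization also rewrites a no-break space inside a string value.

```elixir
# U+00A0 (no-break space) encoded as UTF-8.
nbsp = <<0x00A0::utf8>>

# Hypothetical input: one NBSP between tokens and one inside a string value.
input = "{" <> nbsp <> "\"name\": \"a" <> nbsp <> "b\"}"

# NFKD applies compatibility decompositions, which map U+00A0 to U+0020.
normalized = String.normalize(input, :nfkd)

IO.inspect(String.contains?(input, nbsp), label: "NBSP before")
IO.inspect(String.contains?(normalized, nbsp), label: "NBSP after")
IO.inspect(normalized, label: "normalized")
```

Note that NFKD is not limited to spaces: it also decomposes accented letters and rewrites other compatibility characters, so it can change the contents of the decoded values.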

Hajto commented 1 year ago

I would really love to understand the reasoning behind this error. It is really curious.

ludwikbukowski commented 1 year ago

I have a similar issue, @Hajto.

voughtdq commented 11 months ago

You have a nonbreaking space:

iex> that_string = "{\n  \"name\": \"Elegant Ocean Racers #4970\",\n  \"description\": \"Elegant Ocean Racers are a collection of 5005 elephant & sea creature combinations, inhabiting the Polygon blockchain. Your Elegant Ocean Racer grants you access to the Elegant Ocean Race League.\",\n  \"image\": \"ipfs://QmRhriyxYQVmm4pUFp3WaFwNn99hz15x5SoLWVR5aXUDee/4970.png\",\n  \"dna\": \"77630fb67bbee3bcbe54fdb4d8e0cc3a155fada9\",\n  \"edition\": 4970,\n  \"date\": 1653213448373,\n  \"artist\": \"Tosh\",\n  \"attributes\": [\n    {\n      \"trait_type\": \"Background\",\n      \"value\": \"Gold\"\n    },\n    {\n      \"trait_type\": \"Creature\",\n      \"value\": \"EOR Phantom\"\n    },\n    {\n      \"trait_type\": \"Phant Skin\",\n      \"value\": \"Gold\"\n    },\n    {\n      \"trait_type\": \"Eye Accessory\",\n      \"value\": \"Sun Shades (Gold)\"\n    },\n    {\n      \"trait_type\": \"Headwear\",\n      \"value\": \"Bowler\"\n    },\n    {\n      \"trait_type\": \"Mouth Accessory\",\n      \"value\": \"Smile\"\n    },\n    {\n      \"trait_type\": \"Tusk\",\n      \"value\": \"Gold\"\n    },\n    {\n      \"trait_type\": \"Accessory 2\",\n      \"value\": \"None\"\n    },\n    {\n      \"trait_type\": \"Accessory 1\",\n      \"value\": \"Anchor (Gold)\"\n    },\n    {\n      \"trait_type\": \"Tier\",\n      \"value\": \"Legendary\"\n    }\n  ],\n  \"compiler\": \"HashLips Art Engine\"\n}"
iex> String.to_charlist(that_string)
[123, 10, 160, ...] 

Code point 160 (U+00A0, a non-breaking space) is not valid whitespace per the JSON spec. Replace it with a valid whitespace character - one of [9, 10, 13, 32].
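
To check whether a payload has this problem before decoding, something like the following could be used. It is only a sketch: `NbspFinder` and its functions are hypothetical names, not part of Jason, and the sample string is a shortened stand-in for the one above.

```elixir
defmodule NbspFinder do
  # Hypothetical helper for illustration; not part of Jason.
  @nbsp <<0x00A0::utf8>>

  # Byte offsets of every U+00A0 in the binary.
  def byte_offsets(string) when is_binary(string) do
    for {offset, _length} <- :binary.matches(string, @nbsp), do: offset
  end

  # True when the code point is one of the four JSON whitespace characters.
  def json_whitespace?(codepoint), do: codepoint in [9, 10, 13, 32]
end

# Shortened stand-in: "{", newline, NBSP, then ordinary JSON.
sample = "{\n" <> <<0x00A0::utf8>> <> " \"name\": \"x\"}"

IO.inspect(NbspFinder.byte_offsets(sample), label: "NBSP byte offsets")
IO.inspect(NbspFinder.json_whitespace?(160), label: "160 is JSON whitespace?")
```

The offsets are byte offsets into the UTF-8 binary, which differ from code point indexes once multi-byte characters appear earlier in the string.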

If you replace the character, it works:

iex> Jason.decode(String.replace(that_string, <<160::utf8>>, " "))
{:ok,
 %{
   "artist" => "Tosh",
...}}

However, this introduces a new problem: a <<160::utf8>> could also appear inside your JSON string values, where it is legitimate, and a blanket replace would change it too. So the best solution is to fix whatever is producing that JSON.
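
If the producer cannot be fixed, one possible workaround is to replace the non-breaking space only when it appears outside string literals. The following is a minimal sketch of that idea, assuming the input is otherwise valid UTF-8 JSON; `NbspOutsideStrings` is a hypothetical module, not something Jason provides.

```elixir
defmodule NbspOutsideStrings do
  # Hypothetical helper for illustration; not part of Jason.
  # Replaces U+00A0 with a plain space, but only outside JSON string literals.
  def replace(json) when is_binary(json) do
    json |> outside([]) |> Enum.reverse() |> IO.iodata_to_binary()
  end

  # Outside a string literal: swap NBSP for a space, copy everything else.
  defp outside(<<0x00A0::utf8, rest::binary>>, acc), do: outside(rest, [" " | acc])
  defp outside(<<?", rest::binary>>, acc), do: inside(rest, [?" | acc])
  defp outside(<<byte, rest::binary>>, acc), do: outside(rest, [byte | acc])
  defp outside(<<>>, acc), do: acc

  # Inside a string literal: copy bytes verbatim and skip over backslash
  # escapes so that an escaped quote does not end the literal early.
  defp inside(<<?\\, byte, rest::binary>>, acc), do: inside(rest, [byte, ?\\ | acc])
  defp inside(<<?", rest::binary>>, acc), do: outside(rest, [?" | acc])
  defp inside(<<byte, rest::binary>>, acc), do: inside(rest, [byte | acc])
  defp inside(<<>>, acc), do: acc
end

nbsp = <<0x00A0::utf8>>
# Hypothetical input: NBSP between tokens and NBSP inside the value.
input = "{" <> nbsp <> "\"k\": \"a" <> nbsp <> "b\"}"
IO.inspect(NbspOutsideStrings.replace(input))
```

The result can then be passed to Jason.decode/1. The NBSP inside the value is preserved, which is the difference from a blanket String.replace/3.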

@michalmuskala this can probably be closed, right? It doesn't seem like an issue with Jason itself, but with whatever is producing the malformed JSON.

michalmuskala commented 11 months ago

Ah, great catch. Yes, I'm going to close this. Jason only parses valid JSON as per the spec.