hubertlepicki closed this issue 4 years ago
According to the JSON specification all control characters need to be escaped inside strings.
@ericmj I did suspect that this is the case, yet we don't live in an ideal world, and often the source of the JSON may not entirely comply with the spec.
Poison seems to be handling that case and I was wondering if Jason shouldn't do it too.
In my case the source of this broken JSON is Mailgun webhook callbacks, but I can't imagine they are the only ones doing that.
I would personally opt for following the spec to the letter. A preprocessing step can always escape control characters before passing it to Jason. It's also not hard to fork Jason into your own project to make some minor changes. :-)
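To make the preprocessing idea concrete, here is a minimal sketch of such a step. The module name `ControlCharEscaper` and its `escape/1` function are hypothetical and not part of Jason or Poison. It assumes the only defect in the input is raw control characters (bytes below 0x20) inside string literals, and it leaves whitespace between tokens untouched, since newlines there are valid JSON.

```elixir
defmodule ControlCharEscaper do
  @moduledoc """
  Hypothetical helper: escapes raw control characters (bytes below 0x20)
  that appear inside JSON string literals and copies everything else as is.
  """

  @doc "Returns `json` with raw control characters inside strings escaped."
  def escape(json) when is_binary(json) do
    json |> do_escape(:outside, []) |> IO.iodata_to_binary()
  end

  # End of input: the accumulator was built in reverse order.
  defp do_escape(<<>>, _state, acc), do: Enum.reverse(acc)

  # A quote outside a string opens one.
  defp do_escape(<<?", rest::binary>>, :outside, acc),
    do: do_escape(rest, :inside, [?" | acc])

  # Inside a string, a backslash escapes the next byte; copy both unchanged.
  defp do_escape(<<?\\, next, rest::binary>>, :inside, acc),
    do: do_escape(rest, :inside, [next, ?\\ | acc])

  # An unescaped quote closes the string.
  defp do_escape(<<?", rest::binary>>, :inside, acc),
    do: do_escape(rest, :outside, [?" | acc])

  # A raw control character inside a string becomes an escape sequence.
  defp do_escape(<<c, rest::binary>>, :inside, acc) when c < 0x20,
    do: do_escape(rest, :inside, [escape_char(c) | acc])

  # Any other byte is copied through.
  defp do_escape(<<c, rest::binary>>, state, acc),
    do: do_escape(rest, state, [c | acc])

  defp escape_char(?\n), do: "\\n"
  defp escape_char(?\r), do: "\\r"
  defp escape_char(?\t), do: "\\t"

  defp escape_char(c) do
    hex = c |> Integer.to_string(16) |> String.pad_leading(4, "0")
    "\\u" <> hex
  end
end
```

The cleaned binary could then be piped into the decoder, for example `body |> ControlCharEscaper.escape() |> Jason.decode()`. Input that is broken in other ways will still be rejected by Jason.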
The biggest risk I see with relaxing the spec here is the fact that different parsers could produce different output. There are known security vulnerabilities caused by such differences in other areas (order of elements in objects), so I would be hesitant here without a good understanding that this can't cause similar issues. Especially since "newline-delimited JSON" is an existing format that relies on the property that "raw" newlines are invalid inside JSON strings.
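As a rough illustration of the newline-delimited JSON point, the sketch below splits a stream on raw newlines and decodes each line on its own. The sample records are made up, and the script assumes Elixir 1.12 or later so that `Mix.install/1` can fetch Jason; the version requirement is only illustrative.

```elixir
# Assumption: Elixir 1.12+ for Mix.install/1; the version requirement is illustrative.
Mix.install([{:jason, "~> 1.0"}])

# Made-up sample data: one JSON document per line.
ndjson = """
{"id": 1, "msg": "first"}
{"id": 2, "msg": "second"}
"""

# Each line is a complete document, so splitting on "\n" is safe.
records =
  ndjson
  |> String.split("\n", trim: true)
  |> Enum.map(&Jason.decode!/1)

# A record with a raw newline inside a string is cut into two fragments,
# and neither fragment is a valid JSON document by itself.
broken = "{\"id\": 3, \"msg\": \"two\nlines\"}\n"
fragments = String.split(broken, "\n", trim: true)
results = Enum.map(fragments, &Jason.decode/1)
```

If a parser accepted raw newlines inside strings, a producer could emit such records and every line-based consumer would silently see two broken lines instead of one record.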
Alright, ok I think it makes sense to close it then without fixing. As I said, I wasn't sure if that should be handled. Poison does, Jason does not.
Anyway, I am sure someone in the future will stumble upon this issue and get their problem solved (or at least understood), so it was not an entirely wasted effort.
Just an update in case someone stumbles upon this too: we never solved the problem of the external service sending us invalid JSON. I don't have a great solution either, but I came up with a hack that seems to work: a simple proxy that tries Jason first and, if decoding fails, retries with Poison. This makes 99% of our requests go to Jason, and the faulty 1% go to Poison.
    defmodule JasonWithPoisonFallback do
      @moduledoc """
      A JSON parser wrapper which *on decoding only* first tries Jason and,
      if Jason fails, falls back to Poison.

      It uses Jason only for encoding.
      """

      # Non-raising variant: any result other than {:ok, value} triggers the fallback.
      def decode(input, opts \\ []) do
        case Jason.decode(input, opts) do
          {:ok, value} -> {:ok, value}
          _ -> Poison.decode(input, opts)
        end
      end

      # Raising variant: fall back only when Jason rejects the input.
      def decode!(input, opts \\ []) do
        Jason.decode!(input, opts)
      rescue
        Jason.DecodeError -> Poison.decode!(input, opts)
      end

      # Encoding always goes through Jason.
      defdelegate encode(term, opts \\ []), to: Jason
      defdelegate encode!(term, opts \\ []), to: Jason
      defdelegate encode_to_iodata(input, opts \\ []), to: Jason
      defdelegate encode_to_iodata!(input, opts \\ []), to: Jason
    end
and then set it up to be used by Phoenix:

    config :phoenix, :json_library, JasonWithPoisonFallback
Sadly, this won't work anymore with Poison 4.x. Poison now also follows the spec.
I am using Jason 1.1.2 on Ubuntu Linux, and this is the behavior that works:
Yet, if a message I receive contains a newline, suddenly the decoding breaks:
Switching to Poison fixes the issue:
I think Jason expects the newline to be additionally escaped, because this works the same in Jason and Poison:
It may very well be that the newline character has to be escaped in JSON strings, but that's not what I am getting from the external source.
Is this a bug in Jason, or do I need to look for a workaround elsewhere?
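For readers who want to reproduce the difference described above, a minimal sketch follows. The payloads are hypothetical stand-ins and are not the messages from the original report. The script assumes Elixir 1.12 or later for `Mix.install/1`, with an illustrative version requirement.

```elixir
# Assumption: Elixir 1.12+ for Mix.install/1; the version requirement is illustrative.
Mix.install([{:jason, "~> 1.0"}])

# Hypothetical payloads, not the messages from the original report.
# ~S keeps the backslash, so the JSON text contains the two characters \ and n.
escaped = ~S({"text": "hello\nworld"})

# Here Elixir turns \n into a real newline byte inside the JSON string literal.
raw = "{\"text\": \"hello\nworld\"}"

# The escaped form decodes to a string containing a newline.
{:ok, %{"text" => "hello\nworld"}} = Jason.decode(escaped)

# The raw form is rejected, because control characters must be escaped.
{:error, %Jason.DecodeError{}} = Jason.decode(raw)
```

Both pattern matches succeed, which is the behavior discussed in this thread: Jason accepts the escaped form and returns an error for the raw newline.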