Closed JonjonHays closed 3 years ago
Erg. I'd classify this as a major issue. I'm gonna do a quick whip-around of low-level javascript experts and see what they think
The standard approach for dealing with custom values in serialized JSON-like blobs in JavaScript is to use revivers. For example, supposing that Python inserts NaN
and Infinity
into a JSON blob, thus generating something like
var str = '{"a":NaN,"b":Infinity,"c":[NaN,Infinity,-Infinity]}';
then you'd create the following reviver function
function reviver( key, value ) {
if ( value === '***NaN***' ) {
return NaN;
}
if ( value === '***Infinity***' ) {
return Infinity;
}
if ( value === '***-Infinity***' ) {
return -Infinity;
}
return value;
}
which can then be applied as follows:
var tmp = str.replace( /(NaN|-?Infinity)/g, '***$1***' );
var obj = JSON.parse( tmp, reviver );
// returns { a: NaN, b: Infinity, c: [ NaN, Infinity, -Infinity ] }
thus, behaving as expected.
Two caveats:
NaN
and Infinity
in places other than values, such as in a string of text, you'll need to improve the regular expression provided to replace
.The other alternative is to use a custom JavaScript JSON parser which can handle NaN
and Infinity
, which is straightforward based on Crockford's work.
Hope this helps.
One additional addendum:
One can avoid the string replace
if the serialization code properly encodes special values, such as NaN
and Infinity
. For example, in Python, one can provide a parse_constant
callback.
Doing this work upon serialization, while not obviating the need for a custom reviver
, should improve performance upon parsing the blob in JavaScript and circumvents tricky edge cases such as character sequences appearing in places other than as JSON values (e.g., such as in the middle of a string).
Thanks @kgryte for the detailed info.
I feel like I missed a beat somewhere. If we ignore the possibility of NaN
or Infinity
occurring within normal strings (which I think will be safe for this particular application), why can't the reviver
just be:
function reviver( key, value ) {
if ( value === 'NaN' ) {
return NaN;
}
if ( value === 'Infinity' ) {
return Infinity;
}
if ( value === '-Infinity' ) {
return -Infinity;
}
return value;
}
?
Or is this exactly what you were referring to in your addendum?
Because value
arguments are deserialized values. If you attempt to parse a JSON-like string with bare NaN
and Infinity
values, you should get a parse error if a parser conforms to the JSON-specification. E.g.,
JSON.parse('{"a":NaN,"b":Infinity}')
// throws SyntaxError: Unexpected token N in JSON at position 5
If the serialization logic happens to encode NaN
and Infinity
as string values, e.g.,
var str = '{"a":"NaN","b":"Infinity"}';
then, yes, you can simplify the reviver
implementation to what you describe.
I chose to replace with '***$1***'
, but this could be anything to serve as a sentinel, including just '$1'
.
This issue should be handled by the work in #41, so I'm closing this
Currently the backend server extension is using Python's native JSON package to serialize the payload. Python's JSON library will produce invalid JSON in the presence of NaN or Inf values. The JSON library will add values equivalent to Javascript's NaN and Inf values, but unfortunately these values are not considered valid JSON, and we fail on the frontend when parsing the JSON payload. While this could be handled on the frontend, I think it makes more sense to handle this server-side. Unfortunately, I wasn't able to find a solution for this without bringing in a third party dependency. The
simplejson
library provides anignore_nan
flag as input to itsdump
function that encodes NaN values asNone
/null
. While Javascript data rendering and plotting libraries typically prefer null values, Python libraries (e.g, Matplotlib) tend to prefer NaN values, and may produce errors in the presence of null values. Is it likely this extension will ever be called from a non-Javascript context? If not, it might make sense to simply encode NaN values as null by default. However, if it is likely that this extension will be called from a non-Javascript context, we could add an ignoreNan flag to the query parameters. In my PR for n-dim support, I added an implementation of the former. Here's a discussion on this topic:https://stackoverflow.com/questions/15228651/how-to-parse-json-string-containing-nan-in-node-js