vega / vega-lite

A concise grammar of interactive graphics, built on Vega.
https://vega.github.io/vega-lite/
BSD 3-Clause "New" or "Revised" License
4.64k stars 602 forks source link

Support `Infinity`, `NaN` values in CLI tools, editor #9026

Open shcheklein opened 1 year ago

shcheklein commented 1 year ago

For now just a suggestion. If it makes sense and there are no better alternative I can do a PR for this (hopefully).

We hit this issue https://github.com/iterative/example-repos-dev/issues/225 with JSONs containing non JSON.parse compatible Infinity. It's being produced by Python code, and is compatible with JS, but unfortunately not JSON.parse. It's a well known issue - https://stackoverflow.com/questions/1423081/json-left-out-infinity-and-nan-json-status-in-ecmascript

The proposed way to handle this (and we do this already in our tools where we need to serialize and parse back Inf, NaN, etc) is to use JSON5 instead of JSON.parse. It has zero dependencies, being used in many popular products and would allow Vega CLI (and some other parts - e.g. editor?) to be compatible with any JS-compatible JSONs.

Open to any other suggestions / ideas on how to serialize Vega into JSON that can be then handled properly by the CLI tool, for example.

domoritz commented 1 year ago

Related issues: https://github.com/vega/editor/issues/171 and https://github.com/vega/vega/issues/1148.

The Vega editor already support JSONC (JSON + Comments).

One issue we have come across is that there are many standard besides JSON: HJSON, JSONC, JSON5, etc as well as other formats like yaml, toml, ... and I haven't kept up with which ones are gaining wide support. We definitely don't want something that is not JSON (mainly for pedagogical reasons and it's easy to convert into JSON and then pass to Vega/VL) but I think your suggestion to make sure our interface is flexible enough is good. JSON5 seems like a good one but I'd like to see a comprehensive (with regards to relevant Vega features) comparison of various formats/libraries to make sure we pick a sustainable version. In particular, we need to make sure the JSON standard works with the Monaco editor component we use in the Vega editor.

shcheklein commented 1 year ago

The Vega editor already support JSONC (JSON + Comments).

It doesn't support though the same JSON that Vega within HTML renders just fine (Infinity, etc) AFAIU (?).

Related issues: https://github.com/vega/editor/issues/171 and https://github.com/vega/vega/issues/1148.

I think those two go way beyond this very specific and narrow issue tbh, but may be I'm missing something. In this case we need to find a way to "consume" what Python outputs (sklearn):

If allow_nan is false (default: True), then it will be a ValueError to serialize out of range float values (nan, inf, -inf) in strict compliance of the JSON specification. If allow_nan is true, their JavaScript equivalents (NaN, Infinity, -Infinity) will be used.

They call it "JavaScript equivalents". The way I understand it- it means that if you put that JSON as a JS object it would work. So, it's not some third-party format outside of the JS stack AFAIU. JSON5 is just a tool that can read it. Another (more dangerous way would be to eval).

JSON5 itself is a format that:

... that expands its syntax to include some productions from ECMAScript 5.1 (ES5). It's also a subset of ES5, so valid JSON5 files will always be valid ES5.*

So, AFAIU it's just a subset of JS and superset of JSON. It doesn't go beyond JS.

Anyways, no matter what is decided if we have a CLI tools that consumes JSON and renders it we need a way to pass Inf, NaN. It feels natural to do this in JS-style. Something that browser already "understands" and why it already works outside CLI (when for example we directly serialize it into HTML in DVC, it works just fine with Infinity since browsers "understand" it AFAIU).

Btw, no rush on our end, I put a workaround in that CI action that was using vega CLI tools, and we don't use it outside that. I'm not sure how those CLI Vega tools popular.

domoritz commented 1 year ago

We have to distinguish JS objects (which is what the JS APIs take) and JSON (serialized JS objects; with some caveats as you mentioned). I totally agree that it would be great if the APIs that take serialized JSON specs support the full gamut of what the JS APIs support.

The only thing I am worried about are inconsistencies between different interfaces (JS API, CLI, editor, etc). But as you noticed we already have an inconsistency between the JS API and the CLI. If we can somehow get the editor to also support JSON5, I would be 100% happy. I think, though, we could start with the CLIs for Vega and Vega-Lite. Could you send a pull request?

Paging @jheer, @arvind, and @kanitw. I think switching to json5 makes sense to have a consistent API between the CLI and the JS API. Do you agree and are okay with using json5 for parsing rather than native JSON?