vega / vl-convert

Utilities for converting Vega-Lite specs from the command line and Python
BSD 3-Clause "New" or "Revised" License

`vl-convert-python` does not support built-in Vega datasets #43

Closed: nicolaskruchten closed this issue 11 months ago

nicolaskruchten commented 1 year ago

Here's a little test case... the first one works and the other two don't:

import json
from urllib.request import urlopen

from vl_convert import vegalite_to_svg

prefix = "https://raw.githubusercontent.com/vega/vega-lite/next/examples/specs/"
files = ["arc_donut.vl.json", "point_binned_color.vl.json", "line_color_binned.vl.json"]
for f in files:
    # Download each example spec and try to render it to SVG
    with urlopen(prefix + f) as response:
        spec = json.loads(response.read())
    vegalite_to_svg(spec)

I'm on macOS 11.6 on Apple Silicon, with Python 3.10 installed via mamba.

jonmmease commented 1 year ago

Here is the failing spec:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A scatterplot showing horsepower and miles per gallons with binned acceleration on color.",
  "data": {"url": "data/cars.json"},
  "mark": "point",
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Miles_per_Gallon", "type": "quantitative"},
    "color": {"bin": true, "field": "Acceleration"}
  }
}
ERROR RangeError: Invalid array length
    at ya (https://cdn.skypack.dev/-/vega-scale@v7.2.0-V0J4hKWZmxW22KwUY6dZ/dist=es2020,mode=imports,min/optimized/vega-scale.js:1:5686)
    at yt (https://cdn.skypack.dev/-/vega-encode@v4.9.0-DTdZszvC4Sts6FFZSAIL/dist=es2020,mode=imports,min/optimized/vega-encode.js:1:11413)
    at vt (https://cdn.skypack.dev/-/vega-encode@v4.9.0-DTdZszvC4Sts6FFZSAIL/dist=es2020,mode=imports,min/optimized/vega-encode.js:1:10500)
    at ie.transform (https://cdn.skypack.dev/-/vega-encode@v4.9.0-DTdZszvC4Sts6FFZSAIL/dist=es2020,mode=imports,min/optimized/vega-encode.js:1:8477)
    at ie.evaluate (https://cdn.skypack.dev/-/vega-dataflow@v5.7.4-DrCzG6Luqf74SfPN5Hxw/dist=es2020,mode=imports,min/optimized/vega-dataflow.js:1:15456)
    at ie.run (https://cdn.skypack.dev/-/vega-dataflow@v5.7.4-DrCzG6Luqf74SfPN5Hxw/dist=es2020,mode=imports,min/optimized/vega-dataflow.js:1:15313)
    at ne.Jt [as evaluate] (https://cdn.skypack.dev/-/vega-dataflow@v5.7.4-DrCzG6Luqf74SfPN5Hxw/dist=es2020,mode=imports,min/optimized/vega-dataflow.js:1:12100)
    at async ne.evaluate (https://cdn.skypack.dev/-/vega-view@v5.11.0-qj2ShFtxO2P3GWqTy2DZ/dist=es2020,mode=imports,min/optimized/vega-view.js:2:1594)

The issue is that the data URL is the relative path data/cars.json. If you replace it with an absolute URL, everything works:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A scatterplot showing horsepower and miles per gallons with binned acceleration on color.",
  "data": {"url": "https://raw.githubusercontent.com/vega/vega-datasets/next/data/cars.json"},
  "mark": "point",
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Miles_per_Gallon", "type": "quantitative"},
    "color": {"bin": true, "field": "Acceleration"}
  }
}

The Vega editor (and maybe the Node.js-based image export CLI) supports referring to a set of built-in datasets as if they existed under a data/ directory. vl-convert doesn't currently support this. We could probably bundle the datasets, or remap them to absolute URLs, but neither is implemented yet.
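
Until something like that exists, the remapping can be done on the Python side before the spec reaches vl-convert. Here is a minimal sketch, assuming the built-in datasets live under the same base URL as the absolute cars.json URL above. `absolutize_dataset_urls` and `DATASETS_BASE` are hypothetical names for illustration, not part of vl-convert.

```python
import copy

# Assumption: base derived from the absolute cars.json URL used in the
# working spec above; adjust if the datasets are hosted elsewhere.
DATASETS_BASE = "https://raw.githubusercontent.com/vega/vega-datasets/next/"


def absolutize_dataset_urls(spec, base=DATASETS_BASE):
    """Return a copy of spec with relative "data/..." URLs made absolute.

    Hypothetical helper, not part of vl-convert.
    """

    def walk(node):
        if isinstance(node, dict):
            data = node.get("data")
            if isinstance(data, dict):
                url = data.get("url")
                # Only touch the editor-style relative paths
                if isinstance(url, str) and url.startswith("data/"):
                    data["url"] = base + url
            # Recurse so layered or concatenated specs are covered too
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    result = copy.deepcopy(spec)  # leave the caller's spec untouched
    walk(result)
    return result
```

With this in place, the loop in the test case could call `vegalite_to_svg(absolutize_dataset_urls(spec))` instead of passing the spec directly. URLs that are already absolute are left alone.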

I'll document this in the README and mark this issue as a feature request.

jonmmease commented 11 months ago

Looking at vega-render-service, this might be possible by providing a custom loader to the Vega view instance. See:

https://github.com/vega/vega-render-service/blob/6a0aaf9888175db9407b51837a0130b4bdd684ea/src/app.ts#L69-L103

Along the same lines, a custom loader could also enforce an allow-list of domains that data may be loaded from.
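
To make the allow-list idea concrete, here is a rough Python sketch. It is not a Vega loader: it is a pre-flight check on the spec that a caller could run before rendering, and `disallowed_data_urls` is a hypothetical helper name. It only inspects `"data": {"url": ...}` entries, so other URL-bearing parts of a spec (image marks, for example) are not covered.

```python
from urllib.parse import urlparse


def disallowed_data_urls(spec, allowed_domains):
    """Collect data URLs in spec whose host is not in allowed_domains.

    Hypothetical helper, not part of vl-convert. Relative URLs have no
    host, so they are reported as disallowed as well.
    """
    found = []

    def walk(node):
        if isinstance(node, dict):
            data = node.get("data")
            if isinstance(data, dict) and isinstance(data.get("url"), str):
                host = urlparse(data["url"]).hostname
                if host is None or host not in allowed_domains:
                    found.append(data["url"])
            # Recurse into nested views (layer, concat, lookup sources, ...)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(spec)
    return found
```

A caller could refuse to render whenever the returned list is non-empty, for instance with `allowed_domains={"raw.githubusercontent.com"}`.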