quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Add vega-lite as embedding object #2885

Open cscheid opened 2 years ago

cscheid commented 2 years ago

Discussed in https://github.com/quarto-dev/quarto-cli/discussions/2868

Originally posted by **aborruso** October 15, 2022

Hi, it would be great to have in quarto for vega-lite, something like the Mermaid diagrams. To write something like

````
``` {vega-lite}
{
  "data": {
    "values": [
      {"a": "C", "b": 2},
      {"a": "C", "b": 7},
      {"a": "C", "b": 4},
      {"a": "D", "b": 1},
      {"a": "D", "b": 2},
      {"a": "D", "b": 6},
      {"a": "E", "b": 8},
      {"a": "E", "b": 4},
      {"a": "E", "b": 7}
    ]
  },
  "mark": "point",
  "encoding": {
    "x": {"field": "a", "type": "nominal"}
  }
}
```
````

to get

![image](https://user-images.githubusercontent.com/30607/195983210-85df1bd6-6fcf-49e6-a2a2-c2bdf659c683.png)

Thank you

cscheid commented 1 year ago

Some notes and updates.

In order for this to work well, we would like to extend the vega grammar (very slightly) to support a file data source, so that we can read entries from the filesystem. We should start by only supporting CSV files. This is easy and works well.
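To make the file data source idea concrete, here is a minimal sketch, not the planned implementation. It assumes a hypothetical `data.file` property (which is not part of the Vega or Vega-Lite grammar) and rewrites it into inline `data.values` before the spec is compiled. The names `resolveFileData` and `parseSimpleCsv` are made up for illustration, and the CSV parser is deliberately naive (header row, no quoted fields); a real version would more likely use something like `csvParse` from d3-dsv.

```javascript
// Hypothetical sketch: `data.file` is NOT part of the Vega-Lite grammar.
// It stands in for the proposed file data source.

// Naive CSV parser: assumes a header row and no quoted or escaped fields.
function parseSimpleCsv(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const header = lines[0].split(",");
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    return Object.fromEntries(header.map((name, i) => [name, cells[i] ?? ""]));
  });
}

// Replace a hypothetical `data: {file: "x.csv"}` with `data: {values: [...]}`.
// `readText` is injected so the caller decides how files are read.
function resolveFileData(spec, readText) {
  const file = spec.data?.file;
  if (typeof file !== "string") return spec; // nothing to resolve
  if (!file.toLowerCase().endsWith(".csv")) {
    throw new Error(`Unsupported data file type: ${file}`);
  }
  // Drop the hypothetical `file` key and keep any other data options.
  const { file: _ignored, ...restData } = spec.data;
  return {
    ...spec,
    data: { ...restData, values: parseSimpleCsv(readText(file)) },
  };
}
```

Under Deno, the reader could be passed as `(path) => Deno.readTextFileSync(path)`. All values come back as strings, so numeric and date fields would still rely on the spec's own parsing rules.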

However, vega only natively generates SVG (or a canvas bitmap). This makes compilation to PDF and DocX challenging without a means to convert SVG to PDF. That conversion is trickier than it seems, especially when it comes to font rendering (which I would consider an essential feature). I have a prototype implementation of this for HTML right now, but I don't much like the resulting fragmentation of supported formats.

I'm no longer inclined to do this in 1.3: if we only need HTML support, then using an ojs cell is already possible today, and supports things like interactivity, etc.

I'm convinced now that a high-quality implementation of this feature depends on us getting a good paged-js PDF format first (which we would co-opt to get images for DocX, etc).

cscheid commented 1 year ago

A minimal deno implementation of the svg-only feature really is trivial:


```js
// Load Vega, Vega-Lite, and a CSV parser from the Skypack CDN.
import * as vega from "https://cdn.skypack.dev/vega";
import * as vegaLite from "https://cdn.skypack.dev/vega-lite";
import { csvParse } from "https://cdn.skypack.dev/d3-dsv@3";

// Read the CSV from disk and inline its rows into the spec.
const data = csvParse(Deno.readTextFileSync("data/seattle-weather.csv"));
const vlSpec = {
  data: { values: data },
  mark: "bar",
  encoding: {
    x: { timeUnit: "month", field: "date", type: "ordinal" },
    y: { aggregate: "mean", field: "precipitation" },
  },
};

// Compile Vega-Lite to Vega, then build a headless view (no renderer attached).
const view = new vega.View(vega.parse(vegaLite.compile(vlSpec).spec), { renderer: "none" });

// generate a static SVG image
const svg = await view.toSVG();

console.log(svg);
```

aborruso commented 1 year ago

Hi @cscheid, I'm very happy that you are running some tests on this. vega-lite is in some ways a standard, a tool that many other tools refer to, and since quarto is loved by people who work with data, I think this would be a great feature.

There is a great new tool, mosaic, that in some ways uses those specs and lets the user write a (sort-of) vega-lite syntax in YAML and JSON. You probably already know it. I mention it because its developers include some vega-lite developers and users who seem to be active here too. Maybe there's something in its code that would be useful for an integration here in quarto.

jgunstone commented 9 months ago

For vg.json -> PNG for PDF / DocX: https://github.com/vega/vl-convert