vega / vl-convert

Utilities for converting Vega-Lite specs from the command line and Python
BSD 3-Clause "New" or "Revised" License

vl-convert is stuck when saving a spec with images produced by altair_tiles #63

Closed binste closed 1 year ago

binste commented 1 year ago

I'm using vl-convert 0.10.2, and the code from #59 works as expected. However, when I try to render an Altair chart produced by the development version of altair_tiles, the kernel/vl-convert gets stuck:

image

As there is no error message I don't know how to debug this. @jonmmease Do you have any idea what could cause this? I can download a PNG from the Vega Editor: Open the Chart in the Vega Editor

Code example

import vl_convert as vlc

# This specification is the same as in the Vega editor link above
tile_chart_spec = r"""{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.8.0.json",
  "config": {
    "view": {
      "continuousHeight": 300,
      "continuousWidth": 300
    }
  },
  "datasets": {
    "empty": [
      {}
    ]
  },
  "height": 600,
  "layer": [
    {
      "data": {
        "name": "tile_list",
        "sequence": {
          "as": "a",
          "start": 0,
          "stop": 18
        }
      },
      "encoding": {
        "url": {
          "field": "url",
          "type": "nominal"
        },
        "x": {
          "field": "x",
          "scale": null,
          "type": "quantitative"
        },
        "y": {
          "field": "y",
          "scale": null,
          "type": "quantitative"
        }
      },
      "mark": {
        "clip": true,
        "height": {
          "expr": "tile_size + 1"
        },
        "type": "image",
        "width": {
          "expr": "tile_size + 1"
        }
      },
      "projection": {
        "center": [
          0,
          50
        ],
        "rotate": [
          -10,
          0,
          0
        ],
        "scale": 400,
        "type": "mercator"
      },
      "transform": [
        {
          "as": "b",
          "calculate": "sequence(0, 18)"
        },
        {
          "flatten": [
            "b"
          ]
        },
        {
          "as": "url",
          "calculate": "'https://tile.openstreetmap.org/' + zoom_ceil + '/' + ((datum.a + dii_floor + tiles_count) % tiles_count) + '/' + (datum.b + djj_floor) + '.png'"
        },
        {
          "as": "x",
          "calculate": "datum.a * tile_size + dx + (tile_size / 2)"
        },
        {
          "as": "y",
          "calculate": "datum.b * tile_size + dy + (tile_size / 2)"
        },
        {
          "filter": "((datum.a + dii_floor + tiles_count) % tiles_count) >= 0 && (datum.b + djj_floor) >= 0 && ((datum.a + dii_floor + tiles_count) % tiles_count) <= tiles_count && (datum.b + djj_floor) <= tiles_count"
        }
      ]
    },
    {
      "data": {
        "format": {
          "feature": "countries",
          "type": "topojson"
        },
        "url": "https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/world-110m.json"
      },
      "encoding": {
        "fill": {
          "field": "id",
          "legend": null,
          "type": "quantitative"
        }
      },
      "mark": {
        "fillOpacity": 0.1,
        "stroke": "orange",
        "strokeWidth": 2,
        "type": "geoshape"
      },
      "projection": {
        "center": [
          0,
          50
        ],
        "rotate": [
          -10,
          0,
          0
        ],
        "scale": 400,
        "type": "mercator"
      }
    },
    {
      "data": {
        "name": "empty"
      },
      "encoding": {
        "x": {
          "value": 0
        },
        "y": {
          "value": {
            "expr": "height"
          }
        }
      },
      "mark": {
        "align": "left",
        "dx": 3,
        "dy": -8,
        "text": "(C) OpenStreetMap contributors",
        "type": "text"
      }
    }
  ],
  "params": [
    {
      "name": "base_tile_size",
      "value": 256
    },
    {
      "expr": "400",
      "name": "pr_scale"
    },
    {
      "expr": "log((2 * PI * pr_scale) / base_tile_size) / log(2)",
      "name": "zoom_level"
    },
    {
      "expr": "ceil(zoom_level)",
      "name": "zoom_ceil"
    },
    {
      "expr": "pow(2, zoom_ceil)",
      "name": "tiles_count"
    },
    {
      "expr": "base_tile_size * pow(2, zoom_level - zoom_ceil)",
      "name": "tile_size"
    },
    {
      "expr": "invert('projection', [0, 0])",
      "name": "base_point"
    },
    {
      "expr": "(base_point[0] + 180) / 360 * tiles_count",
      "name": "dii"
    },
    {
      "expr": "floor(dii)",
      "name": "dii_floor"
    },
    {
      "expr": "(dii_floor - dii) * tile_size",
      "name": "dx"
    },
    {
      "expr": "(1 - log(tan(base_point[1] * PI / 180) + 1 / cos(base_point[1] * PI / 180)) / PI) / 2 * tiles_count",
      "name": "djj"
    },
    {
      "expr": "floor(djj)",
      "name": "djj_floor"
    },
    {
      "expr": "round((djj_floor - djj) * tile_size)",
      "name": "dy"
    }
  ],
  "width": 600
}
"""
png_data = vlc.vegalite_to_png(tile_chart_spec)

jonmmease commented 1 year ago

Thanks for the report and repro, I'll take a look

jonmmease commented 1 year ago

When I open the example in the Vega editor (Open the Chart in the Vega Editor), it takes around 5 seconds for the background image to fill in. Is that expected at the moment?

In the Vega Editor, it looks like the tile_list dataset has hundreds of rows. Do you think all of these images are being loaded by the browser? I'm wondering if vl-convert is loading all of these images serially (rather than in parallel, the way a browser might), so it's just taking a very long time.
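As a rough check of the "hundreds of rows" observation, the spec's params can be replayed in plain Python. This is only a sketch: it mirrors the projection-independent expressions (zoom_level, zoom_ceil, tiles_count, tile_size) and counts the rows produced by the two sequence(0, 18) steps before the filter runs. The function name tile_grid_params is made up for illustration and is not part of altair_tiles or vl-convert.

```python
import math


def tile_grid_params(pr_scale, base_tile_size=256, seq_stop=18):
    """Replay the projection-independent params from the spec above.

    Hypothetical helper for illustration only.
    """
    # "log((2 * PI * pr_scale) / base_tile_size) / log(2)"
    zoom_level = math.log2(2 * math.pi * pr_scale / base_tile_size)
    # "ceil(zoom_level)"
    zoom_ceil = math.ceil(zoom_level)
    # "pow(2, zoom_ceil)"
    tiles_count = 2 ** zoom_ceil
    # "base_tile_size * pow(2, zoom_level - zoom_ceil)"
    tile_size = base_tile_size * 2 ** (zoom_level - zoom_ceil)
    # The data sequence yields seq_stop values of `a`; the calculate + flatten
    # steps pair each of them with seq_stop values of `b`.
    candidate_rows = seq_stop * seq_stop
    return {
        "zoom_level": zoom_level,
        "zoom_ceil": zoom_ceil,
        "tiles_count": tiles_count,
        "tile_size": tile_size,
        "candidate_rows": candidate_rows,
    }


params = tile_grid_params(pr_scale=400)
print(params)
```

With pr_scale = 400 this gives a 16 x 16 tile grid at zoom 4, while the two sequences generate 18 x 18 = 324 candidate rows, so the filter has to discard most of them for the tile count to stay small.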

binste commented 1 year ago

Thank you for your help! The loading time is indeed not great although I find it to be much faster in a Jupyter notebook. Not sure if/how I can improve that.

Hmm, good point. tile_list might still contain too many URLs. I try to discard all URLs that are not shown in the chart in the filter transform (Python code is here). If I display the chart in the browser, Vega seems to load only the images that are shown anyway, so I have not investigated whether I discard enough or could get rid of more. I need to have another look at that logic.

For my understanding, does vl-convert load all images in tile_list before the filter is applied, or is the chart first converted to an SVG (i.e. after all filters are applied) and then all URLs in that SVG are downloaded?

binste commented 1 year ago

You're right. This might not be a vl-convert issue but something I should fix on the altair_tiles side. I'll leave this open until I've figured it out, if that's OK, but there's no need for you to investigate further.

I tested it with a spec that should just show 2 tiles (I added a space between them so it's visible): Open the Chart in the Vega Editor

However, even in the browser it loads more tiles, and the SVG contains more URLs than just the two, so I probably need to improve the filtering.
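One way to check how many tile URLs actually end up in the SVG is to parse it and list the href of every image element. The sketch below uses only the standard library and assumes the SVG declares the usual SVG and XLink namespaces; the helper name image_urls_in_svg is hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def image_urls_in_svg(svg_text):
    """Return the href of every <image> element in an SVG string.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(svg_text)
    urls = []
    for element in root.iter(f"{{{SVG_NS}}}image"):
        # SVG 1.1 uses xlink:href, SVG 2 allows a plain href attribute.
        href = element.get(f"{{{XLINK_NS}}}href") or element.get("href")
        if href:
            urls.append(href)
    return urls
```

Something like image_urls_in_svg(vlc.vegalite_to_svg(tile_chart_spec)) should then show whether the filter leaves more URLs in the SVG than the tiles that are visible.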

binste commented 1 year ago

I was able to improve the filtering. Here is an example of a chart with only 2 tiles in tile_list (I again added a space between the tiles so they are easier to spot). In VS Code, the chart displays almost instantly. Open the Chart in the Vega Editor

For this chart, vl-convert takes 18 seconds to save the chart and the resulting chart does not contain the tiles, only the geoshapes:

image

jonmmease commented 1 year ago

vl-convert uses Vega to generate the SVG and uses resvg to render the SVG to a PNG. My guess is that any image that shows up in the SVG will be loaded by resvg, even if it doesn't overlap with the view.

This 2-tile example is a good starting point to investigate what vl-convert is doing. I'll let you know what I find!

jonmmease commented 1 year ago

Fix in progress in https://github.com/vega/vl-convert/pull/64. The main issue is that tile.openstreetmap.org requires the requester to have a User-Agent string set.
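For anyone fetching tiles by hand while debugging, here is a minimal Python sketch of sending an explicit User-Agent header with a tile request. It is not the change made in the PR; the function names and the user agent value are placeholders for illustration.

```python
import urllib.request

# Placeholder value; replace it with a string that identifies your application.
DEFAULT_USER_AGENT = "my-tile-client/0.1 (hypothetical)"


def build_tile_request(url, user_agent=DEFAULT_USER_AGENT):
    """Build a request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_tile(url, user_agent=DEFAULT_USER_AGENT, timeout=10):
    """Download one tile and return its raw bytes (performs a network call)."""
    request = build_tile_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling fetch_tile on one of the URLs from the SVG is a quick way to see whether the tile server answers at all, independent of vl-convert.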

With this fix, your final 2-tile example takes well under 1 second to save to PNG (200-400ms on my machine).

The original example takes close to 3 minutes, but it does complete now.

many_tiles

Hopefully there's a path toward including fewer tiles in the resulting SVG, because I don't think vl-convert will be able to do this much faster given the architectural limits of how it integrates with resvg (it's not possible to parallelize image downloads).
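To compare conversion times before and after reducing the number of tiles, a small timing wrapper is enough. This is a generic sketch; time_call is a made-up helper name, not something vl-convert provides.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

For example, png_data, seconds = time_call(vlc.vegalite_to_png, tile_chart_spec) would report how long a single conversion takes on your machine.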

jonmmease commented 1 year ago

Should be fixed in 0.10.3

binste commented 1 year ago

🥳 This is awesome, thank you @jonmmease for solving it! I'm relieved that this is fixed.

I was also able to include only the visible tiles in the resulting SVG, and now the original spec takes 600ms on my MacBook, which is great.