lukasmartinelli / postgis-editor

An accessible PostGIS query editor and visualizer.
MIT License

Performance with large datasets #4

Open lukasmartinelli opened 8 years ago

lukasmartinelli commented 8 years ago

It seems performance is actually limited by the size of the GeoJSON. At around 110 MB the Electron app might crash. I tried writing the GeoJSON to a file and loading it from there, but it turns out performance is not much better than with my in-memory objects.

The problem is more that we parse the PostgreSQL result into real GeoJSON only once we have received all rows, which might also take 2 seconds. But just running the 110 MB query against the database takes 5 to 10 seconds, so it might still be OK.
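The parsing step described above could look roughly like the sketch below. It assumes each result row carries its geometry as a GeoJSON string in a column named `geom` (for example produced by `ST_AsGeoJSON`). The helper name `rowsToFeatureCollection` is made up for illustration, and the editor's actual code may differ.

```javascript
// Hypothetical helper: turn database rows into a GeoJSON FeatureCollection.
// Assumes row[geomColumn] holds a GeoJSON geometry string (e.g. from ST_AsGeoJSON).
function rowsToFeatureCollection(rows, geomColumn = "geom") {
  const features = [];
  for (const row of rows) {
    if (row[geomColumn] == null) continue; // skip rows without a geometry
    // Everything except the geometry column becomes a feature property.
    const { [geomColumn]: geomText, ...properties } = row;
    features.push({
      type: "Feature",
      geometry: JSON.parse(geomText),
      properties,
    });
  }
  return { type: "FeatureCollection", features };
}
```

Because the whole result is converted in one pass, the cost grows with the number of rows, which fits the observation that parsing a large result takes noticeable time.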

Also there might be a clustering optimization for points.

You can show 350k points and it works. But according to Mapbox we should be able to display millions of points.


baditaflorin commented 8 years ago

At around 100,000 ways, performance is still decent. When I load 200,000 ways I get a white screen and have to close the app.

For nodes, it starts crashing after 200,000 to 300,000 nodes.

ryanbaumann commented 8 years ago

@lukasmartinelli Geojson-VT is the client JS library that renders GeoJSON as vector tiles. For very large GeoJSON data with a large number of features or attributes, the client web worker may run out of memory while loading the data and rendering the vector tiles. Is this what you're seeing right now?

If the GeoJSON data is too big, a solution could be:

  1. Chunk the data into several parts, and load them as different sources into the client. Note that client memory is still the upper limit on what the client can buffer and store as vector tiles on the fly.
  2. Render vector tiles using a server-side tool such as Tippecanoe or the Mapbox Uploads API.
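
    // Option 1: register the data as several smaller GeoJSON sources.
    // The URL is the same for every source here; in practice each source
    // would point at its own chunk of the data.
    for (let i = 0; i < 5; i++) {
        map.addSource("route-" + i, {
            "type": "geojson",
            "data": "http://localhost:8080/smaller-2.geojson",
            "maxzoom": 15,
            "buffer": 1,
            "tolerance": 5
        });
        // One line layer per source, sharing the source's id.
        map.addLayer({
            "id": "route-" + i,
            "type": "line",
            "source": "route-" + i,
            "layout": {
                "line-join": "round",
                "line-cap": "round"
            },
            "paint": {
                "line-color": "red",
                "line-width": 1
            }
        });
    }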
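
If the data is already in memory rather than split into files, the chunking could be sketched as below. The helper names `chunkFeatureCollection` and `addChunkedSources` are hypothetical. The sketch relies only on `map.addSource` and `map.addLayer` accepting inline GeoJSON, and `map` is assumed to be a Mapbox GL JS map whose style has finished loading.

```javascript
// Hypothetical helper: split one FeatureCollection into `parts` smaller ones.
function chunkFeatureCollection(collection, parts) {
  const size = Math.ceil(collection.features.length / parts);
  const chunks = [];
  for (let start = 0; start < collection.features.length; start += size) {
    chunks.push({
      type: "FeatureCollection",
      features: collection.features.slice(start, start + size),
    });
  }
  return chunks;
}

// Register every chunk as its own GeoJSON source with a matching line layer.
// `map` is passed in so the function has no hidden dependency on a global.
function addChunkedSources(map, chunks, prefix = "route-") {
  chunks.forEach((chunk, i) => {
    map.addSource(prefix + i, { type: "geojson", data: chunk });
    map.addLayer({ id: prefix + i, type: "line", source: prefix + i });
  });
}
```

Splitting does not reduce the total amount of data, so client memory remains the upper limit; it only avoids handing one very large object to a single source.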

If the data renders but the map performs poorly when the user interacts with it, clustering a GeoJSON point source or simplifying a vector tile source will help.
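
As a rough illustration of the clustering option, a GeoJSON point source can be declared with `cluster: true`. The zoom and radius values below are assumed placeholders rather than tuned recommendations, and the helper names are made up for this sketch.

```javascript
// Build the spec for a clustered GeoJSON point source.
function clusteredSource(data) {
  return {
    type: "geojson",
    data,
    cluster: true,      // group nearby points into cluster features
    clusterMaxZoom: 14, // assumed value: no clustering beyond this zoom
    clusterRadius: 50,  // assumed value: cluster radius in pixels
  };
}

// Add the source plus one layer for clusters and one for single points.
// `map` is assumed to be a Mapbox GL JS map with a loaded style.
function addClusteredPoints(map, id, data) {
  map.addSource(id, clusteredSource(data));
  map.addLayer({
    id: id + "-clusters",
    type: "circle",
    source: id,
    filter: ["has", "point_count"], // cluster features carry point_count
    paint: { "circle-radius": 15, "circle-color": "red" },
  });
  map.addLayer({
    id: id + "-points",
    type: "circle",
    source: id,
    filter: ["!has", "point_count"], // unclustered points only
    paint: { "circle-radius": 3, "circle-color": "red" },
  });
}
```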

lukasmartinelli commented 8 years ago

> @lukasmartinelli Geojson-VT is the client JS library that renders GeoJSON as vector tiles. For very large GeoJSON data with a large number of features or attributes, the client web worker may run out of memory while loading the data and rendering the vector tiles. Is this what you're seeing right now?

That is exactly what happened when I last did memory profiling. But Geojson-VT is impressive.

> If the GeoJSON data is too big, a solution could be:
>
>   1. Chunk the data into several parts, and load them as different sources into the client. Note that client memory is still the upper limit on what the client can buffer and store as vector tiles on the fly.
>   2. Render vector tiles using a server-side tool such as Tippecanoe or the Mapbox Uploads API.

So if I want to fix this problem I could create the vector tiles in the Electron main process and then serve actual vector tiles to the Chromium client? Essentially an on-demand PostGIS-to-vector-tile tile server like tessera. I could then use PostGIS to ensure that not too much is loaded at once (or at least stay within the limits of vector tiles).

What do you think about that?
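
To make the idea more concrete, here is a sketch of only the part where PostGIS limits what is loaded per tile; it does not encode real vector tiles. It assumes geometries are stored in SRID 4326, uses the standard z/x/y slippy-map tile scheme, and returns a `{ text, values }` object in the style of a parameterized node-postgres query. The function names are hypothetical.

```javascript
// Convert a z/x/y tile address to its WGS84 bounding box (slippy-map scheme).
function tileToBBox(z, x, y) {
  const n = Math.pow(2, z); // number of tiles per axis at this zoom
  const lon = (col) => (col / n) * 360 - 180;
  const lat = (row) => {
    const t = Math.PI - (2 * Math.PI * row) / n;
    return (180 / Math.PI) * Math.atan(Math.sinh(t));
  };
  return { west: lon(x), south: lat(y + 1), east: lon(x + 1), north: lat(y) };
}

// Build a parameterized query that only returns geometries whose bounding
// box overlaps the tile. Table and column names are placeholders and must
// come from trusted code, never from user input.
function tileQuery(table, geomColumn, z, x, y) {
  const b = tileToBBox(z, x, y);
  return {
    text:
      `SELECT ST_AsGeoJSON(${geomColumn}) AS geom FROM ${table} ` +
      `WHERE ${geomColumn} && ST_MakeEnvelope($1, $2, $3, $4, 4326)`,
    values: [b.west, b.south, b.east, b.north],
  };
}
```

A tile server in the main process would run one such query per requested tile, so the amount of data handed to the client is bounded by what falls inside that tile.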

Serving the GeoJSON is not ideal; it was just my quickest hack so far. And I was impressed that the client can handle the 100 MB file.

The idea with the multiple sources is a nice one! I was astonished earlier that with multiple sources I can display much more than is possible with a single source when displaying millions of points, and now you have explained the reason 👍

Since you are already here and know a lot about Mapbox GL JS, I have a question if you don't mind 😁 The batch API is gone in the latest release of Mapbox GL JS (which I use in PostGIS editor as well).

How should batch updates to a style be done in the newest version? Is that now done internally automatically and we don't have to worry?

ryanbaumann commented 8 years ago

@lukasmartinelli Yep, batch updates are done automatically since GL JS v0.17

> So if I want to fix this problem I could create the vector tiles in the Electron main process and then serve actual vector tiles to the Chromium client? Essentially an on-demand PostGIS-to-vector-tile tile server like tessera. I could then use PostGIS to ensure that not too much is loaded at once (or at least stay within the limits of vector tiles).

👍 Sounds solid. Mapbox GL JS can handle almost any data size when it is sourced from vector tiles.