Improve general performance when working with large datasets

DigitalCommons / open-data-and-maps

Deprecated: Implementation of Linked Open Data by the Solidarity Economy Association


Improve general performance when working with large datasets #161

Closed joebillings closed 5 years ago

joebillings commented 5 years ago

Datasets larger than about 4000 seem to suffer quite badly in almost all areas – loading, general moving around, selection.

joebillings commented 5 years ago

I'm going to look into server-side processing of the data to try to ease the strain, particularly on mobile devices.

joebillings commented 5 years ago

https://github.com/rclark/server-side-leaflet

ColmMassey commented 5 years ago

This sounds quite different from loading the minimum data necessary to display the map and leaving the other data to be loaded either after the map is created or when required.

joebillings commented 5 years ago

I'll continue looking, but the problem doesn't seem to be the amount of data as a whole; it's the number of data points that's the issue.

joebillings commented 5 years ago

Things to investigate:

Dynamic server side clustering: https://geovation.github.io/dynamic-server-side-geo-clustering
Storing lat/lng as a geohash (up to 50% smaller): http://www.movable-type.co.uk/scripts/geohash.html
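
For reference, a minimal sketch of the standard geohash encoding mentioned in the second item. The function name `encodeGeohash` and the default precision are assumptions for illustration, not code from this project, and whether the encoded form is actually smaller depends on the precision chosen and on how the coordinates are currently serialised.

```javascript
// Standard geohash base-32 alphabet (digits plus letters, without a, i, l, o).
const GEOHASH_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Hypothetical helper: encode a lat/lng pair as a geohash string.
// `precision` is the number of characters; each character carries 5 bits.
function encodeGeohash(lat, lng, precision = 9) {
  let latMin = -90, latMax = 90;
  let lngMin = -180, lngMax = 180;
  let hash = "";
  let bits = 0;        // bits collected for the current character
  let value = 0;       // running 5-bit value
  let useLng = true;   // bits alternate, starting with longitude

  while (hash.length < precision) {
    if (useLng) {
      const mid = (lngMin + lngMax) / 2;
      if (lng >= mid) { value = value * 2 + 1; lngMin = mid; }
      else            { value = value * 2;     lngMax = mid; }
    } else {
      const mid = (latMin + latMax) / 2;
      if (lat >= mid) { value = value * 2 + 1; latMin = mid; }
      else            { value = value * 2;     latMax = mid; }
    }
    useLng = !useLng;
    bits += 1;
    if (bits === 5) {
      hash += GEOHASH_BASE32[value];
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

// Example: encodeGeohash(57.64911, 10.40744, 6) gives "u4pruy"
```

Shorter hashes describe larger cells, so a shared prefix also gives a rough way to group nearby points, which may be useful alongside clustering.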

joebillings commented 5 years ago

Virtuoso now supports GeoSPARQL, which would allow us to make location-based queries that could ease the load. The only way to make this work would be to zoom in to a specific location when the app loads and request the initiatives from that area. The rest would get loaded as the user moves the map around. I'm not sure this would work on the larger global datasets, though.
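
A rough sketch of what a per-viewport request could look like. This is an assumption-heavy illustration: it uses a plain numeric FILTER on W3C WGS84 `geo:lat` / `geo:long` properties rather than GeoSPARQL functions, the function name `buildBoundingBoxQuery` is hypothetical, and the properties used by the real dataset may differ.

```javascript
// Hypothetical helper: build a SPARQL query for the points inside a bounding box.
// Assumes each initiative carries numeric geo:lat / geo:long values; if they are
// stored as plain strings, the comparison would need a cast first.
// Does not handle boxes that cross the antimeridian (west > east).
function buildBoundingBoxQuery(bounds, limit = 1000) {
  const { south, west, north, east } = bounds;
  for (const v of [south, west, north, east, limit]) {
    if (!Number.isFinite(v)) {
      throw new TypeError("bounds and limit must be finite numbers");
    }
  }
  return [
    "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>",
    "SELECT ?initiative ?lat ?lng WHERE {",
    "  ?initiative geo:lat ?lat ;",
    "              geo:long ?lng .",
    `  FILTER (?lat >= ${south} && ?lat <= ${north} &&`,
    `          ?lng >= ${west} && ?lng <= ${east})`,
    "}",
    `LIMIT ${limit}`,
  ].join("\n");
}
```

In the app, the bounds could come from Leaflet's `map.getBounds()` (via `getSouth()`, `getWest()`, `getNorth()` and `getEast()`), with the query re-run on the map's `moveend` event so that the rest loads as the user moves around.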

ColmMassey commented 5 years ago

Is https://github.com/Leaflet/Leaflet.markercluster the library we are currently using? From the documentation, it doesn't look like ~9k initiatives should cause performance issues.

joebillings commented 5 years ago

Yeah that's the one.

ColmMassey commented 5 years ago

https://leaflet.github.io/Leaflet.markercluster/example/marker-clustering-realworld.50000.html has 50k markers and it is lightning fast, so I can't see how it could be the number of Leaflet markers.

joebillings commented 5 years ago

This is loading much faster now. I've moved the sorting to the server – the SPARQL query now sorts by country and then name. I've also moved the country name conversion to the sausage machine – previously the two-character ISO code was passed through from the original data and then mapped to the full name in the JS.

Other things that could potentially be improved:

We could try loading location data only and then getting the rest of the data when we need it. This may make initiative selection a little sluggish.
We could store the results in local storage to stop them from being downloaded each time the page loads.
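
The local storage idea could be sketched roughly as below. The helper names `writeCache` and `readCache`, the stored entry shape and the expiry parameter are all assumptions for illustration. `storage` is any object with `getItem` / `setItem`, such as `window.localStorage`.

```javascript
// Hypothetical helper: save data with a timestamp so it can be expired later.
// Returns false if the write fails, e.g. because the storage quota is exceeded.
function writeCache(storage, key, data, now = Date.now()) {
  try {
    storage.setItem(key, JSON.stringify({ savedAt: now, data }));
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical helper: return the cached data, or null if it is missing,
// unreadable, or older than maxAgeMs.
function readCache(storage, key, maxAgeMs, now = Date.now()) {
  const raw = storage.getItem(key);
  if (raw === null) return null;
  try {
    const entry = JSON.parse(raw);
    if (typeof entry.savedAt !== "number") return null;
    if (now - entry.savedAt > maxAgeMs) return null;
    return entry.data;
  } catch (err) {
    return null;
  }
}
```

On page load the app would try `readCache` first and only run the SPARQL request when it returns null, calling `writeCache` with the fresh results. Browsers limit how much can be kept in local storage, so a large dataset may not fit, which is why the write is allowed to fail quietly.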