Closed mikefab closed 6 years ago
Thanks for the update @mikefab. Have you profiled this using something like https://developers.google.com/web/tools/chrome-devtools/rendering-tools/? My experiments showed that latency was mostly associated with leaflet scripting itself (e.g. creating a new point/marker), not with data access. FWIW - I got an order of magnitude better results using mapbox over leaflet.
Once we have some profiling data, we can have a more informed discussion on this one.
@sherl0cks How many points did you render in your experiment?
100,000. I was largely using Jupyter integrations to create a map, save it to HTML, and then serve the static HTML. Using this approach, I could render 100,000 points in ~4.5 seconds on mapbox and in ~10 seconds with leaflet, so about half the time, not an order of magnitude. That said, this was using a clustered visualization, not displaying all the points.
Here is a relevant issue: https://github.com/python-visualization/folium/issues/803#issuecomment-354429679
I can recreate these for you guys and share tomorrow. I'm in the middle of a few other things today.
Also, on the question of data representation, I'd like to explore the extent to which vanilla GeoJSON can be used at the serving tier with a tool like http://geoserver.org/. The GeoJSON representations can be optimized for each mapping use case (e.g. delete properties not needed). Once the data is in a GeoJSON representation, we can work with guidance like https://www.mapbox.com/help/working-with-large-geojson-data/.
The idea here would be to shift the focus from custom API dev to optimizing UX, as well as building data pipelines to create materialized views of GeoJSON in GeoServer for different use cases.
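To make the "delete properties not needed" idea concrete, here is a minimal sketch of pruning a GeoJSON FeatureCollection down to the properties one map view needs. The function name `prune_properties`, the property names (`speed`, `name`, `address`) and the sample data are all hypothetical; nothing here reflects the project's actual schema.

```python
import json


def prune_properties(feature_collection, keep):
    """Return a new FeatureCollection whose features carry only the properties in `keep`."""
    pruned = []
    for feature in feature_collection["features"]:
        props = feature.get("properties") or {}
        pruned.append({
            "type": "Feature",
            "geometry": feature["geometry"],
            # Keep only the properties this particular map view needs.
            "properties": {k: v for k, v in props.items() if k in keep},
        })
    return {"type": "FeatureCollection", "features": pruned}


# Hypothetical sample data, for illustration only.
schools = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-74.08, 4.61]},
            "properties": {"speed": 3.2, "name": "School A", "address": "unknown"},
        }
    ],
}

slim = prune_properties(schools, keep={"speed"})
# Compact separators avoid spending bytes on whitespace in the served payload.
payload = json.dumps(slim, separators=(",", ":"))
print(len(json.dumps(schools)), "->", len(payload), "characters")
```

A pipeline along these lines could write one pruned file per use case, which is roughly what a materialized view would amount to here.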
That would be great! Please let me know when would be best for you tomorrow.
I agree we could talk more about how we are delivering data through the API. I just made changes to the 2DToGeoJson function (which makes GeoJSON from a 2D array) so that it only includes features for which the value is not null.
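For readers following along, a rough sketch of that kind of null filtering might look like the following. This is not the project's 2DToGeoJson code: the name `rows_to_geojson`, the `[lat, lon, value]` row layout and the `speed` property key are assumptions made for illustration.

```python
def rows_to_geojson(rows, value_key="speed"):
    """Build a GeoJSON FeatureCollection from rows of [lat, lon, value].

    Rows whose value is None are skipped, so no feature is emitted for them.
    The row layout is an assumption, not the project's actual format.
    """
    features = []
    for lat, lon, value in rows:
        if value is None:
            continue  # drop null values instead of shipping empty features
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {value_key: value},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical rows; the second one has no value and is dropped.
rows = [[4.61, -74.08, 2.5], [6.25, -75.56, None], [3.45, -76.53, 0.0]]
collection = rows_to_geojson(rows)
print(len(collection["features"]))
```

Note the explicit `is None` check: a value of `0.0` is kept, since a measured speed of zero is different from a missing one.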
Fredy experimented with clusters using: react-leaflet-markercluster. We're testing it on staging.
OK, thanks. My profiling shows ~23 seconds to load the first version and ~60 seconds to load the cluster version. In both cases, 90%+ of the time is spent in leaflet.
In the experiments I did, there was a distinction between Leaflet's MarkerCluster and FastMarkerCluster, with an order of magnitude difference between the two. Apparently I did a poor job saving my work, so I will need to recreate those experiments.
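Since most of the time appears to go into creating markers, one way to reason about clustering is to count how many objects the map has to draw before and after grouping. The sketch below uses plain grid binning, which is a stand-in chosen for simplicity and is not the algorithm used by MarkerCluster or FastMarkerCluster; `grid_clusters`, the cell size and the sample points are hypothetical.

```python
import math
from collections import defaultdict


def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into square cells of `cell_deg` degrees.

    Returns one dict per non-empty cell holding the mean position and the
    number of points, so the map draws one object per cell, not per point.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))

    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / n,
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    return clusters


# Hypothetical points: two fall in the same 1-degree cell, one elsewhere.
points = [(4.61, -74.08), (4.70, -74.05), (6.25, -75.56)]
clusters = grid_clusters(points)
print(len(points), "points ->", len(clusters), "clusters")
```

Doing the grouping ahead of time (in a pipeline or on the server) would keep that work out of the browser entirely, at the cost of fixed cluster boundaries.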
@sherl0cks, we're looking at map gl and deck.gl. We've also been working on a webgl version. The first draft is hosted here.
Interesting. Seems like deck.gl is built around mapbox - I'm supportive of that. The link 404s for me though.
@sherl0cks, thanks for catching that. The link is updated in the comment. http://school-mapping-development.azurewebsites.net
Very cool. Profiling shows a ~4s response time for Colombia. That's a big improvement! It appears deck.gl is nicely integrated with mapbox vector tiles, so there is an opportunity to move the map tiles over to mapbox, which will probably also improve performance.
And once in the mapbox ecosystem, there are a lot of tools that would be interesting to your data scientists and that integrate with Jupyter, e.g. https://github.com/mapbox/mapboxgl-jupyter. This would allow folks to prototype different visualizations really cheaply in Jupyter, and then use much of the same technology to build web apps as you are already doing.
@sherl0cks, per "School Data from API Has A Whole Bunch of nulls", we've refactored so that only the columns useful for coloring the points (connectivity speed) are fetched. Information specific to a school is requested when its point is clicked.
You can see a first draft here. Mauritania loads reasonably quickly with just 2,936 schools. Colombia, with its 49,020 schools, takes longer and slows down the app.
@ayanez17 is experimenting with alternative ways to represent the data. First try of a heatmap:
We're also considering a sampling technique where only a fraction of all points (none clickable) are displayed at the country level, and then all points displayed when the user clicks on a state. You can see an example of that here when you click on Brazil.
Do you have any suggestions on other ways to represent the data?
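For discussion, the sampling idea above could be sketched roughly as follows: show a fixed fraction of points at the country level and every point once a state is selected. The names `sample_points` and `points_for_view`, the 10% fraction, the fixed seed and the sample data are all assumptions, not what the app currently does.

```python
import random


def sample_points(points, fraction, seed=0):
    """Return a reproducible random subset containing about `fraction` of the points."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not points:
        return []
    k = max(1, round(len(points) * fraction))
    # A fixed seed keeps the same points visible across page loads.
    return random.Random(seed).sample(points, k)


def points_for_view(points_by_state, selected_state=None, fraction=0.1):
    """Country level: a sample of all points. State level: every point in that state."""
    if selected_state is not None:
        return list(points_by_state[selected_state])
    all_points = [p for pts in points_by_state.values() for p in pts]
    return sample_points(all_points, fraction)


# Hypothetical data: 20 points in one state, 30 in another.
points_by_state = {
    "state_a": [(i * 0.01, -50.0) for i in range(20)],
    "state_b": [(i * 0.01, -60.0) for i in range(30)],
}
country_view = points_for_view(points_by_state)
state_view = points_for_view(points_by_state, selected_state="state_a")
print(len(country_view), len(state_view))
```

One trade-off worth noting: uniform sampling can leave sparse states with no visible points at the country level, so sampling per state may be preferable if coverage matters more than overall proportion.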