unicef / magicbox-maps

Map mobility data in a NodeJS + React front-end application with data served by magicbox-open-api
http://magicbox-maps.azurewebsites.net
BSD 3-Clause "New" or "Revised" License

Replace country level points with heatmap #24

Closed: mikefab closed this issue 6 years ago

mikefab commented 6 years ago

@sherl0cks, per the issue "School Data from API Has A Whole Bunch of nulls", we've refactored so that only the columns needed to color the points (connectivity speed) are fetched. Information specific to a school is requested when its point is clicked.

[image: screenshot, 2018-01-24, 12:57:01 PM]

You can see a first draft here. Mauritania loads reasonably quickly with just 2,936 schools. Colombia, with its 49,020 schools, takes longer and slows down the app.

@ayanez17 is experimenting with alternative ways to represent the data. First try of a heatmap: [image: WhatsApp image, 2018-01-24, 12:23:16 PM]

We're also considering a sampling technique where only a fraction of all points (none clickable) is displayed at the country level, and then all points are displayed when the user clicks on a state. You can see an example of that here when you click on Brazil.
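A minimal sketch of the sampling idea, assuming the schools arrive as a GeoJSON FeatureCollection. The function name `sample_features` and its `fraction` and `seed` parameters are hypothetical and not taken from the app:

```python
import random


def sample_features(collection, fraction, seed=None):
    """Return a new FeatureCollection holding roughly `fraction` of the features.

    Hypothetical helper: `collection` is assumed to be a GeoJSON
    FeatureCollection dict; the input is not modified.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    features = collection["features"]
    # Keep at least one point for any non-empty country.
    k = max(1, round(len(features) * fraction)) if features else 0
    # A seeded generator makes the sample reproducible between page loads.
    rng = random.Random(seed)
    return {"type": "FeatureCollection", "features": rng.sample(features, k)}
```

At the country level the map would draw the sampled collection (for example `fraction=0.1`), and on a state click it would draw the full, unsampled set for that state.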

Do you have any suggestions on other ways to represent the data?

sherl0cks commented 6 years ago

Thanks for the update @mikefab. Have you profiled this using something like https://developers.google.com/web/tools/chrome-devtools/rendering-tools/? My experiments showed that latency was mostly associated with Leaflet scripting itself (e.g. creating a new point/marker), not with data access. FWIW - I got an order of magnitude better results using Mapbox over Leaflet.

Once we have some profiling data, we can have a more informed discussion on this one.

alfredoxyanez commented 6 years ago

@sherl0cks How many points did you render in your experiment?

sherl0cks commented 6 years ago

100,000. I was largely using Jupyter integrations to create a map, save it to HTML, and then serve the static HTML. Using this approach, I could render 100,000 points in ~4.5 seconds on Mapbox and ~10 seconds with Leaflet, so about half the time, not an order of magnitude. That said, this was using a clustered visualization, not displaying all the points.

Here is a relevant issue: https://github.com/python-visualization/folium/issues/803#issuecomment-354429679

I can recreate these for you guys and share tomorrow. I'm in the middle of a few other things today.

sherl0cks commented 6 years ago

Also, on the question of data representation, I'd like to explore the extent to which vanilla GeoJSON can be used at the serving tier with a tool like http://geoserver.org/. The GeoJSON representations can be optimized for each mapping use case (e.g. by deleting properties that aren't needed). Once you are in a GeoJSON representation, we can work with guidance like https://www.mapbox.com/help/working-with-large-geojson-data/.

The idea here would be to shift the focus from custom API dev to optimizing UX as well as building data pipelines to create materialized views of GeoJSON in GeoServer for different use cases.
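As a rough illustration of trimming GeoJSON for one use case, here is a sketch that keeps only the listed properties and rounds coordinates. It assumes Point features only; the helper names and the `precision` default are hypothetical, and nothing here is specific to GeoServer:

```python
def slim_feature(feature, keep, precision=5):
    """Copy a Point feature, keeping only the properties named in `keep`.

    Hypothetical helper. Coordinates are rounded to `precision` decimal
    places (5 places is roughly 1 m), which shrinks the serialized size.
    """
    props = feature.get("properties") or {}
    lon, lat = feature["geometry"]["coordinates"][:2]
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": [round(lon, precision), round(lat, precision)],
        },
        "properties": {key: props[key] for key in keep if key in props},
    }


def slim_collection(collection, keep, precision=5):
    """Apply slim_feature to every feature in a FeatureCollection."""
    return {
        "type": "FeatureCollection",
        "features": [slim_feature(f, keep, precision) for f in collection["features"]],
    }
```

A pipeline step could call `slim_collection(schools, keep=["speed"])` to produce the view used for coloring points, where `"speed"` stands in for whatever the connectivity column is actually called.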

alfredoxyanez commented 6 years ago

That would be great! Please let me know when would be best for you tomorrow.

alfredoxyanez commented 6 years ago

I agree we could talk more about how we are delivering data through the API. I just made changes to the 2DToGeoJson function (which makes GeoJSON from a 2D array) so that it only includes features for which the value is not null.
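For readers following along, this is a sketch of that kind of conversion, not the project's actual 2DToGeoJson code. It assumes each row of the 2D array is `[lat, lon, value]`; that layout, the function name `rows_to_geojson`, and the `value_key` parameter are all assumptions:

```python
def rows_to_geojson(rows, value_key="value"):
    """Build a FeatureCollection from rows of [lat, lon, value].

    Hypothetical helper: rows whose value is None (null) are skipped,
    so no feature is emitted for schools with missing data.
    """
    features = []
    for lat, lon, value in rows:
        if value is None:
            continue  # drop null values; 0 is a real value and is kept
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {value_key: value},
        })
    return {"type": "FeatureCollection", "features": features}
```

The check is `value is None` rather than a truthiness test so that a measured value of `0` still produces a point.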

mikefab commented 6 years ago

Fredy experimented with clusters using react-leaflet-markercluster. We're testing it on staging.

image

image

sherl0cks commented 6 years ago

OK, thanks. My profiling shows ~23 seconds to load the first version and ~60 seconds to load the cluster version. In both cases, 90%+ of the time is spent in Leaflet.

In the experiments I did, there was a distinction between Leaflet's MarkerCluster and FastMarkerCluster, with an order of magnitude difference between the two. Apparently I did a poor job saving my work, so I will need to recreate those experiments.

mikefab commented 6 years ago

@sherl0cks, we're looking at map gl and deck.gl. We've also been working on a WebGL version. The first draft is hosted here.

[image: screenshot, 2018-02-06, 10:27:14 AM]

sherl0cks commented 6 years ago

Interesting. Seems like deck.gl is built around Mapbox - I'm supportive of that. The link 404s for me though.

mikefab commented 6 years ago

@sherl0cks, thanks for catching that. The link is updated in the comment. http://school-mapping-development.azurewebsites.net

sherl0cks commented 6 years ago

Very cool. Profiling shows a ~4s response time for Colombia. That's a big improvement! It appears deck.gl is nicely integrated with Mapbox vector tiles, so there is an opportunity to move the map tiles over to Mapbox, which will probably also improve performance.

And once in the Mapbox ecosystem, there are a lot of tools that integrate with Jupyter and would interest your data scientists, e.g. https://github.com/mapbox/mapboxgl-jupyter. This would allow folks to prototype different visualizations really cheaply in Jupyter, and then use much of the same technology to build web apps as you are already doing.