pbugnion / gmaps

Google maps for Jupyter notebooks
https://jupyter-gmaps.readthedocs.io/en/stable/

how to deal with a large export.html? #300

Open markbroich opened 5 years ago

markbroich commented 5 years ago

Hi Pascal,

First of all: Thank you for developing and sharing gmaps.

My question in a nutshell: how to deal with a large export.html (that appears to be too big for my browser to open).

Details: I am running into a scalability issue and was wondering if you could give me pointers on how best to proceed. I am trying to use Google Maps as the front end for exploring a large number of polygons (in a GeoJSON file) across a large region.

When I run my code (below) on the rather large GeoJSON, the geojson_layer = gmaps.geojson_layer(geometry) call takes ~10 min and the resulting HTML (after export) is ~660 MB. However, I cannot open such a large HTML file in a browser or embed it in a web page.

What could I do to make displaying (a large number of polygons across a large area) workable?

Thank you for any pointers and for creating gmaps. Regards, Mark

my py code subset:

with open('my.geojson') as f:
    geometry = json.load(f)
geojson_layer = gmaps.geojson_layer(geometry)
fig.add_layer(geojson_layer)
embed_minimal_html('export.html', views=[fig])

I have since also explored this approach (https://github.com/nholmber/google-maps-statistics), and while the resulting HTML is now tiny, I am running into the following issue:

I am trying to display a large GeoJSON file (starting with the Google Maps JavaScript API 'loadGeoJson' command; https://developers.google.com/maps/documentation/javascript/datalayer ). My GeoJSON file can be 100-800 MB with 500k to 1,500k polygons. I scaled back the precision of the coordinates to 3 decimal places, but I did not manage to display a Google Maps HTML page in my browser when it was created from a GeoJSON with more than ~15k rows (polygons).

I found online that one way of doing this is to dynamically load from the Json to the browser depending on the region viewed and the zoom level but I did not yet find code examples or a description on how to do that.

pbugnion commented 5 years ago

Thanks for raising this.

It'd be useful to know the following:

  1. does the map display in Jupyter?
  2. when you were running your code snippet that runs embed_minimal_html, was that in the notebook or in a separate Python script? Sometimes ipywidgets embed_minimal_html has trouble distinguishing the data that it needs just for the view you are trying to show when run in the notebook, which can result in large file sizes.

In terms of reducing GeoJSON size, I don't think reducing float precision is going to help much with export size. I would look at reducing the quality of the GeoJSON (i.e. reducing the number of nodes in each polygon). I can remember using an online service to do this in the past, but I can't remember where.
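One way to reduce the number of nodes per polygon, as suggested above, is the Ramer-Douglas-Peucker algorithm. The following is a minimal pure-Python sketch and is not part of jupyter-gmaps. The function names (`simplify_line`, `simplify_ring`, `simplify_geojson`) and the tolerance value are assumptions chosen for illustration, and distances are treated as planar and measured in degrees, which is only an approximation.

```python
import math


def _point_segment_distance(p, a, b):
    """Planar distance (in coordinate units) from point p to segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        # Degenerate segment (e.g. a closed ring whose first and last
        # points coincide): fall back to the distance to that point.
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def simplify_line(points, tolerance):
    """Ramer-Douglas-Peucker: drop points closer than tolerance to the chord."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    left = simplify_line(points[: index + 1], tolerance)
    right = simplify_line(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


def simplify_ring(ring, tolerance):
    """Simplify a closed ring; keep the original if it would become invalid."""
    simplified = simplify_line(ring, tolerance)
    return simplified if len(simplified) >= 4 else ring


def simplify_geojson(geojson, tolerance):
    """Simplify Polygon and MultiPolygon features of a FeatureCollection in place."""
    for feature in geojson["features"]:
        geom = feature["geometry"]
        if geom["type"] == "Polygon":
            geom["coordinates"] = [
                simplify_ring(r, tolerance) for r in geom["coordinates"]
            ]
        elif geom["type"] == "MultiPolygon":
            geom["coordinates"] = [
                [simplify_ring(r, tolerance) for r in poly]
                for poly in geom["coordinates"]
            ]
    return geojson
```

The recursion is fine for rings of moderate size; very large rings would need an iterative variant. Because each polygon is simplified independently, neighbouring polygons that share an edge may no longer line up exactly afterwards.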

> I found online that one way of doing this is to dynamically load from the Json to the browser depending on the region viewed and the zoom level but I did not yet find code examples or a description on how to do that.

Ultimately this is the only solution that will let you retain high fidelity. This is not currently possible with jupyter-gmaps. Even if you did that entirely in JavaScript, you would still have to have some clever server-side code that only returns part of the GeoJSON.

markbroich commented 5 years ago

Thanks for your quick response Pascal,

I am rerunning the code (this time definitely in a notebook, but on my laptop). I will let you know what I find regarding 1) and 2) as soon as I have the results (e.g. tomorrow morning). Simplifying the polygons is not a good option, as I need them to be rather exact (e.g. with lat/long rounded to 3 decimal places) because I am working on flood mapping (to give some context).

As for the fidelity I would like users of my product to be able to explore the results w/o too much delay (they would likely be ok with some delay as this is not an e.g. real time traffic navigation situation). As for the server side, do you know someone I could ask for pointers? Thank you for your time. Mark

markbroich commented 5 years ago

Hi Pascal,

In response to your questions: 1) the full size map (zoom set to entire extent) after add_layer of the full size GeoJSON does not display in the notebook. 2) I ran embed_minimal_html on a 'larger' AWS EC2 in py code (below). The result file was large and I could not open it in my browser. I did not manage to create an html file when running in a notebook on my laptop.

For now I will create result subsets and link to them (as a simple and quick solution). Please let me know your thoughts/ give me pointers as you see fit. Thanks for your work and help.

Cheers, Mark

The Python code I used:

from ipywidgets.embed import embed_minimal_html
import os
import gmaps
import time
import json

gmaps.configure(api_key='xxx')

map_coordinates = (-31.4, 145.8)
zoom_level = 5.1
fig = gmaps.figure(center=map_coordinates, zoom_level=zoom_level)

with open('/home/ubuntu/poly/2018_autumn_max.geojson') as f:
    geometry = json.load(f)
geojson_layer = gmaps.geojson_layer(geometry)
fig.add_layer(geojson_layer)

embed_minimal_html('/home/ubuntu/html/fall.html', views=[fig])
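The "result subsets" workaround mentioned in this comment could look roughly like the sketch below, which splits one FeatureCollection into smaller ones that can each be exported separately. The function name `split_feature_collection` and the chunk size are assumptions for illustration; the split follows the order of features in the file, so the subsets are only spatially coherent if the file is already ordered by region.

```python
def split_feature_collection(geojson, chunk_size):
    """Yield FeatureCollections holding at most chunk_size features each."""
    features = geojson["features"]
    for start in range(0, len(features), chunk_size):
        yield {
            "type": "FeatureCollection",
            "features": features[start:start + chunk_size],
        }
```

Each yielded dictionary has the same shape as the original GeoJSON, so it can be passed to gmaps.geojson_layer in place of the full file.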

pbugnion commented 5 years ago

The only solution I can think of would be to build a web app. The browser client could monitor the map for zoom and pan events and send the coordinates of the current viewport to the server. The server would then be responsible for figuring out which features fit into that viewport and sending them back to the client. At low zoom values (when the viewport includes many features), the server could either downsample the edges or read from a pre-compiled down-sampled geojson.
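The server-side step described above (figuring out which features fit into the current viewport) can be sketched in plain Python as a bounding-box test. This is a minimal illustration under stated assumptions, not an existing tool: the names `feature_bbox`, `build_index` and `features_in_viewport` are hypothetical, only Polygon and MultiPolygon geometries are handled, viewports crossing the antimeridian are ignored, and the lookup is a linear scan.

```python
def feature_bbox(feature):
    """Return (west, south, east, north) for a Polygon or MultiPolygon feature."""
    geom = feature["geometry"]
    if geom["type"] == "Polygon":
        rings = geom["coordinates"]
    elif geom["type"] == "MultiPolygon":
        rings = [ring for poly in geom["coordinates"] for ring in poly]
    else:
        raise ValueError("unsupported geometry type: " + geom["type"])
    lons = [pt[0] for ring in rings for pt in ring]
    lats = [pt[1] for ring in rings for pt in ring]
    return min(lons), min(lats), max(lons), max(lats)


def build_index(geojson):
    """Precompute one bounding box per feature (done once at server start)."""
    return [(feature_bbox(f), f) for f in geojson["features"]]


def features_in_viewport(index, west, south, east, north):
    """Return a FeatureCollection of features whose bbox overlaps the viewport."""
    hits = [
        feature
        for (w, s, e, n), feature in index
        if w <= east and e >= west and s <= north and n >= south
    ]
    return {"type": "FeatureCollection", "features": hits}
```

With hundreds of thousands of polygons, a spatial index such as an R-tree would likely be needed in place of the linear scan, and at low zoom levels the server would return a down-sampled version of each feature, as described in the comment.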

I don't know of any tool that does this out of the box, but there may well be one.

markbroich commented 5 years ago

Thanks for your feedback. There may also be a memory leak: when I loop over the code (provided before), feeding it tiny GeoJSON subsets, the resulting HTML keeps getting bigger, into the 100 MB range. The way I got around it was to call the Python code from a bash loop.
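The bash-loop workaround can also be expressed from Python by running each export in a fresh interpreter, so that nothing from an earlier export carries over into the next one. This is a minimal sketch: `run_isolated` is a hypothetical helper, and the export script it launches (here called `export_one.py` in the usage note) is assumed to contain the export code shown earlier, taking its input and output paths as command-line arguments.

```python
import subprocess
import sys


def run_isolated(script_path, *args):
    """Run a Python script in a new interpreter process and wait for it.

    Raises subprocess.CalledProcessError if the script exits with an error.
    """
    result = subprocess.run([sys.executable, script_path, *args], check=True)
    return result.returncode
```

A loop over subsets would then call, for example, run_isolated('export_one.py', geojson_path, html_path) once per subset, where both paths and the script name are placeholders.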