WFP-VAM / prism-app

PRISM is an interactive map-based dashboard that simplifies the integration of geospatial data on hazards, along with information on socioeconomic vulnerability
MIT License

Performance - improve handling of large boundary files (eg. RBD) #467

Closed ericboucher closed 6 months ago

ericboucher commented 2 years ago

We should increase the caching time allowed for boundary files to speed up app initialization.

wadhwamatic commented 2 years ago

@ericboucher - I'm not sure if what you have in mind will help with an issue we have with large boundary files with complex geometries. For example, there are three admin boundary files used in the upcoming RBD deployment, ranging from 14 to 40 MB in size. Loading time is slow, as you'd expect (see https://demo-prism-rbd-v4.surge.sh).

What are your thoughts?

ericboucher commented 2 years ago

My first thought would be to simplify the boundaries with the script that @DanielJDufour created. That would probably help quite a bit already. We might also want to look at cloud-optimized GeoJSON.

ericboucher commented 2 years ago

In some cases, we also seem to be loading admin boundaries that do not get used. For example, on http://prism-510.surge.sh/ we load admin_boundaries.json, but I am not sure it actually gets used.

PysarenkoDS-BWT commented 2 years ago

I tried to save the files locally, but without success. I tried LocalStorage, SessionStorage, and Cookies, but the files are too large for all of them. I also tried the Cache API, which is an experimental feature, but it does not work with responses this large either.

PysarenkoDS-BWT commented 2 years ago

I investigated how we can use the topojson package. Converting every GeoJSON file into its own TopoJSON file doesn't make sense, because a single TopoJSON file takes more space than the GeoJSON it came from (for example, converting ecu_admbnda_adm2 to TopoJSON increases the file size from 37.2 to 41.6 MB).

But we can convert several GeoJSON files into one TopoJSON file. I tried this with ecu_admbnda_adm1 and ecu_admbnda_adm2: their combined size is 17.9 + 37.2 = 55.1 MB, and the resulting TopoJSON file is 41.7 MB.

So we could convert and store one TopoJSON file per country and convert it back to GeoJSON files on the front end, which reduces the size of the data downloaded from the server.
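A rough way to see why merging layers helps: nested admin levels tend to repeat the same boundary vertices, and a shared topology can store those once. The sketch below is not the topojson package and does not build arcs; it only counts how many vertices of one layer also appear in another, as a crude proxy for that redundancy. The names (`countSharedVertices`, `adm1`, `adm2`) and the toy geometries are hypothetical, and real files from different sources may not share exact coordinates.

```typescript
type Position = [number, number];
type Ring = Position[];

// Count how many vertices of layer `b` also occur in layer `a`.
function countSharedVertices(a: Ring[], b: Ring[]): number {
  const seen: Record<string, boolean> = {};
  for (const ring of a) {
    for (const point of ring) {
      seen[point[0] + "," + point[1]] = true;
    }
  }
  let shared = 0;
  for (const ring of b) {
    for (const point of ring) {
      if (seen[point[0] + "," + point[1]] === true) shared++;
    }
  }
  return shared;
}

// Toy data: one adm1 square split into two adm2 rectangles.
const adm1: Ring[] = [[[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]];
const adm2: Ring[] = [
  [[0, 0], [1, 0], [1, 2], [0, 2], [0, 0]],
  [[1, 0], [2, 0], [2, 2], [1, 2], [1, 0]],
];

const sharedCount = countSharedVertices(adm1, adm2);
const totalAdm2 = adm2.reduce((sum, ring) => sum + ring.length, 0);
console.log(sharedCount + " of " + totalAdm2 + " adm2 vertices also appear in adm1");
```

A high share suggests the two layers are good candidates for a single combined file; a low share suggests the boundaries were digitized independently and merging will save less.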

wadhwamatic commented 9 months ago

@ericboucher - I think we should look into using vector tiles (https://github.com/mapbox/vector-tile-spec/) for boundaries and potentially generate them using tippecanoe (https://github.com/felt/tippecanoe).

ericboucher commented 9 months ago

Also relevant and "new" - https://www.openstreetmap.org/user/daniel-j-h/diary/402706

laurentS commented 7 months ago

I had a quick look at this for rbd and tried removing admin boundary layers from defaultDisplayBoundaries in prism.json to see the impact on load time and memory (according to the Firefox profiler) in fully local conditions (so no impact of network slowdowns):

It's just a hunch, but it's likely that vector tiles, as suggested above, would indeed help with faster inits, and most probably with memory usage as well. This would probably require changing the code that handles clicks and such to determine where clicks happened, zoom to a region, etc...

ericboucher commented 7 months ago

https://maplibre.org/maplibre-gl-js/docs/guides/large-data/ lists a few ideas as well. I tried running https://reducegeojson.radicaldata.org/ to reduce file size in the PR I just opened; we'll see how it goes. It also gives a few other ideas and tiling options.

https://github.com/WFP-VAM/prism-app/pull/1054

ericboucher commented 7 months ago

A few ideas (ordered):

ericboucher commented 7 months ago

It seems that rasterized borders are available for level 0,1,2 in HDC, https://data.earthobservation.vam.wfp.org/stac/#/collections/wldrfh_admin_lvl2?.language=en

laurentS commented 6 months ago

For what it's worth, I gave msgpack a quick try for something else, and it's a tradeoff:

As an alternative, there might be a way to use columnar JSON, which something like polars uses, i.e., instead of an array of objects:

[{
  "key1": "value11",
  "key2": "value12"
},
{
  "key1": "value21",
  "key2": "value22"
}
]

use an object with an array of values for each key:

{
"key1": ["value11", "value21"],
"key2": ["value12", "value22"]
}

If all objects have the same shape, the resulting JSON file is about 10-12x smaller. This transformation can be done at build time for the server, and then reversed in a few milliseconds in the browser after loading the JSON. The benefit is faster network transfer and no extra dependencies (the code to convert is ~10 lines of JS). It would be worth it if the network transfer takes more than ~50 ms. It only works for regular structures like the ones in public/data/rbd/ica.json, for instance. It is probably useless for admin boundary files.
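For illustration, the conversion in both directions could look roughly like this. It is a minimal sketch, assuming every object has exactly the same keys as the first one; the helper names `toColumnar` and `fromColumnar` are made up here and do not exist in prism-app.

```typescript
type Row = Record<string, unknown>;
type Columnar = Record<string, unknown[]>;

// Build time: array of objects -> one array of values per key.
// Assumes every row has the same keys as the first row.
function toColumnar(rows: Row[]): Columnar {
  const columns: Columnar = {};
  if (rows.length === 0) return columns;
  for (const key of Object.keys(rows[0])) {
    columns[key] = rows.map((row) => row[key]);
  }
  return columns;
}

// Browser: one array of values per key -> array of objects.
function fromColumnar(columns: Columnar): Row[] {
  const keys = Object.keys(columns);
  const rowCount = keys.length === 0 ? 0 : columns[keys[0]].length;
  const rows: Row[] = [];
  for (let i = 0; i < rowCount; i++) {
    const row: Row = {};
    for (const key of keys) {
      row[key] = columns[key][i];
    }
    rows.push(row);
  }
  return rows;
}

const sampleRows: Row[] = [
  { key1: "value11", key2: "value12" },
  { key1: "value21", key2: "value22" },
];
const columnar = toColumnar(sampleRows);
const restored = fromColumnar(columnar);
console.log(JSON.stringify(columnar));
```

The saving comes from writing each key once instead of once per object, so the actual ratio depends on how long the keys are relative to the values.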

ericboucher commented 6 months ago

After digging into it, it seems that GeoJSON is actually not the main issue here; the problem is that the files are far too precise. I am closing this issue in favor of https://github.com/WFP-VAM/prism-app/issues/1057 and suggest sourcing better boundary files and/or using a tool like mapshaper.org to simplify the geometries.
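To give a sense of what "too precise" can mean, here is a minimal sketch that rounds every coordinate to a fixed number of decimals. This is plain precision truncation, not the geometry simplification that mapshaper performs: it shortens each number in the file but does not remove any vertices. The function name `roundCoordinates` and the sample ring are hypothetical; 5 decimals of a degree is roughly a metre at the equator.

```typescript
// A GeoJSON coordinates value: a number, or arbitrarily nested arrays of them.
type Coordinates = number | Coordinates[];

// Recursively round every number to `decimals` decimal places.
function roundCoordinates(coords: Coordinates, decimals: number): Coordinates {
  if (typeof coords === "number") {
    const factor = Math.pow(10, decimals);
    return Math.round(coords * factor) / factor;
  }
  return coords.map((c) => roundCoordinates(c, decimals));
}

// Made-up sample ring with 9 decimals per coordinate.
const ring: Coordinates = [
  [-78.512345678, -0.223456789],
  [-78.498765432, -0.219876543],
];
const rounded = roundCoordinates(ring, 5);
console.log(JSON.stringify(rounded));
```

After rounding, neighbouring vertices can collapse onto the same point, so a real pipeline would still want a proper simplification step to drop them.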