debrief / pepys-import

Support library for Pepys maritime data analysis environment
https://pepys-import.readthedocs.io/
Apache License 2.0

Ways to provide offline background mapping #1131

Open robintw opened 2 years ago

robintw commented 2 years ago

I'm using this issue as a place to record my thoughts about various ways of providing offline background mapping.

The problem

A user will be using Jupyter notebooks to perform analysis on ship data stored in Pepys. They want to be able to display the ship data on a map (probably using Folium, as it integrates well with geopandas and makes the ship data easy to plot, but a similar library could be used) and contextualise the data by showing some sort of background map. However, their computer will be disconnected from the internet for security reasons, so web-based background maps cannot be used.

Using folium offline

This has already been solved, by creating a module called offline_folium which will download the required JS/CSS files and then alter folium to use the downloaded versions. See here for more details.

Background map data wanted

A variety of background map data could be useful for analysis. This includes:

The areas covered by these data could be small parts of the UK, the whole of the UK, or the whole world.

Options

GeoJSON coastline data

This is the simplest option. Some GeoJSON files could be provided with Pepys which contain coastline data for various regions (Scotland, UK, Europe, World). The analyst decides which of these they want to display, and adds a line like display_coastlines('europe', m) to their notebook (where m is a reference to the folium Map object that they have created). This uses the built-in folium GeoJSON support, and displays the GeoJSON on the map. It can be styled in various ways (different widths of line, shading of land area etc).
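To make the idea concrete, here is a rough sketch of what display_coastlines could look like. It is only an illustration: the coastlines folder, the one-file-per-region naming and the styling values are assumptions, not anything that exists in Pepys today. The only folium call used is folium.GeoJson with a style_function.

```python
import json
from pathlib import Path

# Hypothetical folder and file naming; the issue does not say where Pepys
# would ship these files or what they would be called.
COASTLINE_DIR = Path("coastlines")
REGIONS = ("scotland", "uk", "europe", "world")


def coastline_path(region):
    """Return the GeoJSON file for a named region, e.g. 'europe'."""
    key = region.lower()
    if key not in REGIONS:
        raise ValueError(f"Unknown region {region!r}; expected one of {REGIONS}")
    return COASTLINE_DIR / f"{key}.geojson"


def coastline_style(feature):
    """Style callback for folium: thin dark outline, lightly shaded land."""
    return {"color": "#333333", "weight": 1, "fillColor": "#cccccc", "fillOpacity": 0.4}


def display_coastlines(region, m):
    """Add the coastline for `region` to the folium Map `m` and return the layer."""
    # Imported here so the path/style helpers above work without folium installed.
    import folium

    with open(coastline_path(region), encoding="utf-8") as f:
        data = json.load(f)
    layer = folium.GeoJson(data, name=f"{region} coastline", style_function=coastline_style)
    layer.add_to(m)
    return layer
```

Because the GeoJSON is embedded in the notebook output, nothing needs to be served over HTTP for this option.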

This data can be provided at various levels of detail and corresponding size of data. A quick experiment has shown that the UK coastline can take anything from under 1 MB to over 10 MB of GeoJSON data to store, depending on the level of generalisation. At a low level of detail, the whole world's coastlines can fit in 10 MB of GeoJSON, but this may be too coarse to be useful for analysts (as ships in the Solent, for example, could be shown as being within the Isle of Wight's coarsened coastline).

Advantages:

Disadvantages:

Nautical charts

UKHO charts are available to the analysts as GeoTIFF files. These can be displayed as a background layer on the map, and would give analysts a lot of context for the sea areas (plus some for the coastal land areas). There are a couple of ways of doing this:

Loading an individual chart

We could provide a folder of charts on the server that hosts Pepys, and the user could select a specific chart to load (this would require them knowing the name of the chart etc, but they may be used to choosing a chart anyway in software like ArcGIS, or we could provide some sort of lookup to guess a good chart for the area). We would then display it on the folium map.
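One possible shape for the "lookup to guess a good chart" is sketched below. It assumes we have a catalogue of chart extents as lon/lat bounding boxes; the chart names and extents here are made up, and real ones would presumably be read from the GeoTIFF metadata. It picks the smallest chart covering a point, on the theory that the smallest extent is usually the most detailed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chart:
    name: str
    west: float
    south: float
    east: float
    north: float

    def contains(self, lon, lat):
        """True if the point lies inside this chart's bounding box."""
        return self.west <= lon <= self.east and self.south <= lat <= self.north

    @property
    def area(self):
        """Extent in square degrees; a crude proxy for chart scale."""
        return (self.east - self.west) * (self.north - self.south)


def guess_chart(charts, lon, lat):
    """Return the smallest-extent chart covering the point, or None."""
    candidates = [c for c in charts if c.contains(lon, lat)]
    return min(candidates, key=lambda c: c.area, default=None)


# Made-up catalogue, for illustration only.
CHARTS = [
    Chart("channel_overview.tif", -6.0, 48.0, 2.0, 51.5),
    Chart("solent_detail.tif", -1.7, 50.6, -0.9, 50.95),
]
```

This ignores charts that cross the antimeridian, and comparing areas in degrees is only a rough stand-in for comparing chart scales.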

This would require extending folium to be able to display GeoTIFF files. There is a plugin for Leaflet maps to allow displaying GeoTIFFs (see here), and it would be relatively easy to extend folium to be able to use this. However, the big problem would be that the folium JS code can't access the file-system on the computer - so the maps would have to be hosted somewhere that can serve them over HTTP. Although this sounds like a deal-breaker, a simple HTTP server could be run on the server that hosts the master copy of Pepys, or a local server could be run on each analyst's machine (only when the Jupyter notebook is running - it could be started in a separate process by the Pepys Admin tool). Both of these would just point at the folder containing the GeoTIFF files, and serve them all over HTTP.
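As a sketch of the "local server on each analyst's machine" idea, the Python standard library is enough to serve a folder over HTTP. The class and function names below are hypothetical. The extra CORS header is an assumption that the notebook page and the chart server will be on different origins, in which case the browser would otherwise block the JS from reading the files.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class ChartRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that also allows cross-origin requests."""

    def end_headers(self):
        # Let JS running in the notebook page fetch files from this server.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def start_chart_server(chart_dir, host="127.0.0.1", port=0):
    """Serve `chart_dir` in a background thread and return the server.

    port=0 asks the OS for any free port; read it from server.server_address.
    """
    handler = functools.partial(ChartRequestHandler, directory=str(chart_dir))
    server = ThreadingHTTPServer((host, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

One caveat: as far as I am aware, the standard library handler does not honour HTTP Range requests, so it would send whole files rather than just the parts a client asks for. That is fine for a proof of concept, but a server with Range support would probably be needed to get the benefit of Cloud Optimized GeoTIFFs.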

For efficient display we would want to convert the GeoTIFFs to Cloud Optimized GeoTIFFs (COGs), but that is very easy to do.
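For example, GDAL (3.1 or later) has a COG output driver, so the conversion can be a single gdal_translate call per chart. The sketch below assumes gdal_translate is on the PATH and that the charts are .tif files in one folder; the function names and the choice of DEFLATE compression are mine, not a recommendation from anyone who has tested this on the UKHO charts.

```python
import subprocess
from pathlib import Path


def cog_command(src, dst):
    """Build the gdal_translate command that writes `src` out as a COG."""
    return ["gdal_translate", "-of", "COG", "-co", "COMPRESS=DEFLATE", str(src), str(dst)]


def convert_folder(src_dir, dst_dir):
    """Convert every .tif in `src_dir` to a COG of the same name in `dst_dir`."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(src_dir).glob("*.tif")):
        # check=True raises if GDAL reports an error for this chart.
        subprocess.run(cog_command(src, dst_dir / src.name), check=True)
```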

Advantages:

Disadvantages:

Loading a mosaic of multiple charts

We could provide a way of serving a tile layer like the standard OpenStreetMap tiles (lots of little square images that together make a map, and that are available at a range of resolutions) but composed of a mosaic of the UKHO charts. This could be done with a tool like TiTiler. This is a slightly more involved server to run - so should probably only be run in a central place on the network - but is all written in Python and is pretty easy to deploy. In simple terms, it takes a list of GeoTIFF files (well, COGs actually - to make it efficient) and produces tiles on the fly at the relevant zoom level, and deals with mosaicing everything together.
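For anyone unfamiliar with how tile layers are addressed, the sketch below shows the standard XYZ ("slippy map") scheme used by OpenStreetMap: at zoom level z the world is a 2^z by 2^z grid of tiles, and each request names one tile by z/x/y. The function name is mine; the formula is the usual Web Mercator one.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the XYZ tile containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern/southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)
```

A tile server only has to answer requests of this form, which is why the number of tiles it may need to render grows by a factor of four with each zoom level.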

As this is a tile layer just like OpenStreetMap, it doesn't require any extensions to folium, as folium can already display tile layers.
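On the folium side this could be as small as the sketch below. The URL is a placeholder: the host and path depend entirely on how the tile server is deployed and which endpoint it exposes, so treat it as an assumption. folium.TileLayer itself is the standard folium class, and it requires an attribution string for custom tile URLs.

```python
# Placeholder URL template; the real host and path depend on the deployment.
CHART_TILE_URL = "http://tiles.example.internal/charts/{z}/{x}/{y}.png"


def add_chart_layer(m, tile_url=CHART_TILE_URL):
    """Add the chart mosaic to the folium Map `m` as a base layer."""
    # Imported here so this module can be loaded without folium installed.
    import folium

    layer = folium.TileLayer(
        tiles=tile_url,
        attr="UKHO charts",  # attribution text shown in the map corner
        name="Nautical charts",
        overlay=False,
        control=True,
    )
    layer.add_to(m)
    return layer
```

In a notebook this would be used along the lines of m = folium.Map(location=[50.8, -1.3], zoom_start=10, tiles=None) followed by add_chart_layer(m), where tiles=None stops folium from adding its default web-based OpenStreetMap layer.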

Advantages:

Disadvantages:

More detailed vector mapping

Not sure how relevant this is, as most of the vector data we have will be over land. However, it could provide a nicer way of dealing with more detailed vector coastline data (compared to the GeoJSON option). This would involve storing vector data (such as coastlines, but potentially much more - right up to basically all the OpenStreetMap data) in the PostGIS database, and then serving it up as either raster tiles (like OpenStreetMap tiles) or vector tiles (little chunks of vector data, that can be styled on the fly).

Raster tiles

To create raster tiles on the fly we'd need to set up an OpenStreetMap tile server, or similar. I don't have that much experience with this, so I'll leave this there.

Note from IanM: I've looked into this a couple of times. The last time I looked, it was an 80 GB download, then load that into a Postgres instance, then (optionally) initialise a tile-cache for likely areas of interest, and lastly run a web-server to serve the tiles. I believe client Tech Support staff could handle this.

Vector tiles

PostGIS can create vector tiles on the fly now (exciting new feature!), and there is a very lightweight server that can go in front of PostGIS called pg_tileserv which can serve them easily across a network. So, we could run this on the same server that runs Postgres, and get folium to connect to it. Unfortunately folium doesn't support vector tiles by default, but there is a Leaflet extension that does - and we could write an extension for folium to work with this Leaflet extension.
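To give a feel for what "vector tiles on the fly" means, the sketch below shows roughly the kind of query involved, using the PostGIS functions ST_TileEnvelope, ST_AsMVTGeom and ST_AsMVT (PostGIS 3.0 or later). pg_tileserv generates its own SQL, so this is an illustration rather than what it actually runs. The coastlines table, its geom and name columns, and the assumption that geom is stored in EPSG:4326 are all hypothetical.

```python
# Illustrative query; table and column names are made up.
# Placeholders use the %(name)s style accepted by psycopg2.
TILE_SQL = """
WITH mvtgeom AS (
    SELECT
        ST_AsMVTGeom(
            ST_Transform(t.geom, 3857),
            ST_TileEnvelope(%(z)s, %(x)s, %(y)s)
        ) AS geom,
        t.name
    FROM coastlines AS t
    WHERE t.geom && ST_Transform(ST_TileEnvelope(%(z)s, %(x)s, %(y)s), 4326)
)
SELECT ST_AsMVT(mvtgeom.*, 'coastlines') FROM mvtgeom;
"""


def tile_params(z, x, y):
    """Validate an XYZ tile address and return it as query parameters."""
    n = 2 ** z
    if z < 0 or not (0 <= x < n and 0 <= y < n):
        raise ValueError(f"Tile {z}/{x}/{y} is outside the grid for zoom {z}")
    return {"z": z, "x": x, "y": y}
```

With a database cursor, this would be run as cursor.execute(TILE_SQL, tile_params(10, 508, 343)), and the single value returned is the binary tile to send to the browser.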

Advantages:

Disadvantages:

Concluding thoughts

This is the result of various bits of research I've done, plus other experience I have (for example, I've been using and deploying TiTiler in my other work at the moment). I don't know which is best, as it will depend very much on the client's requirements. I hope these notes on advantages and disadvantages will be useful - and I hope the notes will help any future developer implement one of these solutions. Happy to discuss further, just wanted to get these thoughts out of my mind and into a semi-permanent place.

IanMayo commented 2 years ago

Note: by coincidence I was made aware of this data source last week: https://www.marineregions.org/downloads.php