mrmap-community / mrmap

Spatial Service Registry
https://mrmap.rtfd.io/en/master/
MIT License

Atom Feed Client #35

Closed jokiefer closed 3 years ago

jokiefer commented 3 years ago

Atom Feed

Atom is technically the successor of RSS and was standardised in 2005 (RFC 4287). RSS and Atom are pretty similar to each other, but Atom lets each text element declare its type (plain text, HTML or XHTML), so embedded HTML can be rendered nicely on the client side. RSS has no such type declaration, so a client cannot reliably tell plain text from escaped markup.

Basically, an Atom feed may look like this:


<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Beispiel Feed</title>
  <link href="http://www.beispiel.org/"/>
  <updated>2005-12-13T17:30:01Z</updated>
  <author>
    <name>Hannes Schmidt</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>

  <entry>
    <title>Atom Feed Beispiel</title>
    <link href="http://beispiel.org/2005/12/13/atom05"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2005-12-13T17:30:01Z</updated>
    <summary>Textzeile</summary>
  </entry>

</feed>

Atom Feed and INSPIRE

Since 2012, INSPIRE expects service data to be downloadable with something other than a desktop GIS client (like QGIS). The solution ended up being a simple web client, which receives an Atom feed URL and extracts information from the given feed.

The user is then able to download the data, which is provided by the underlying service of the atom feed.
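The extraction step can be illustrated with the standard library alone. This is only a minimal sketch, not the planned client: `parse_feed` and `SAMPLE_FEED` are hypothetical names, the sample is a shortened copy of the example feed above, and real INSPIRE feeds carry additional elements that are ignored here.

```python
import xml.etree.ElementTree as ET

# Namespace map for Atom elements (the URI is the one used in the example feed).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

SAMPLE_FEED = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Beispiel Feed</title>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <updated>2005-12-13T17:30:01Z</updated>
  <entry>
    <title>Atom Feed Beispiel</title>
    <link href="http://beispiel.org/2005/12/13/atom05"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2005-12-13T17:30:01Z</updated>
  </entry>
</feed>"""


def parse_feed(xml_text):
    """Extract the feed title and a list of entries from an Atom document."""
    # Parse from bytes so the encoding declaration in the XML prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        entries.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "href": link.get("href") if link is not None else None,
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
        })
    return {
        "title": root.findtext("atom:title", default="", namespaces=ATOM_NS),
        "entries": entries,
    }


feed = parse_feed(SAMPLE_FEED)
print(feed["title"], len(feed["entries"]))
```

In a real client the XML would come from the feed URL instead of a constant, and the `href` of each entry would point the user to the downloadable data.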

Implementations

There are numerous implementations of this technique out there. Just a few examples

Since we are working on a Mapbender-like successor, we may follow the current Geoportal RLP approach. We may also use other techniques and implement further enhancements.

GDI-DE technical guidance

The GDI-DE has published a document on implementing an Atom feed client. It may be useful to read it beforehand.

Atom Feed GDI examples

Requirements

  1. A new Django app called atom has to be created
  2. The atom feed client landing page is callable using a /atom-feed route
    1. Additional parameters are
      1. q (query)
        • optional
        • holds multiple keywords, separated using + for the minimal catalogue
        • example: /atom-feed?q=test+keyword+search
  3. The atom feed client for a specific service is callable using /atom-feed/<id>
  4. The client is not embedded into the MrMap UI, but - just as the HTML Metadata - accessible for everyone, even guest users
  5. A new model called AtomFeedDownload has to be created, holding the following attributes
    1. id (UUID, not an auto-incrementing integer)
    2. metadata (ForeignKey, no related_name)
    3. zip_uuid (UUID)
    4. timestamp (DateTime)
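How the q parameter from requirement 2 decodes can be shown with the standard library. This is a sketch under assumptions: `keywords_from_url` is a hypothetical helper, not part of MrMap. Inside a Django view, `request.GET.get("q")` would already return the decoded string, so only the final split would be needed there.

```python
from urllib.parse import parse_qs, urlsplit


def keywords_from_url(url):
    """Return the search keywords carried by the optional q parameter."""
    query = urlsplit(url).query
    # parse_qs decodes "+" as a space, so "test+keyword+search"
    # arrives as the single value "test keyword search".
    values = parse_qs(query).get("q", [])
    keywords = []
    for value in values:
        keywords.extend(value.split())
    return keywords


print(keywords_from_url("/atom-feed?q=test+keyword+search"))
print(keywords_from_url("/atom-feed"))
```

Returning an empty list when q is missing keeps the landing page case (no search yet) and the search case on the same code path.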

Atom Feed Generator

  1. The atom feed documents will not be pre-calculated. Since they are rather small, they will be generated on request
  2. A new AtomFeedGenerator has to be created, which may be placed in the project folder /service/helper/atom
    1. this generator must generate an XML document, based on the available information from a metadata record
  3. The catalogue API has to be extended
    1. Next to the existing attributes of a catalogue metadata entry, another attribute called atom_feed_url must be set
    2. atom_feed_url shall follow the structure /service/metadata/<id>/atom-feed
    3. A call of this atom_feed_url must generate the corresponding atom feed document
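Generating the document on request could look roughly like the following. It is a sketch only: `build_atom_feed` and the keys of the `record` dict (`id`, `title`, `abstract`, `updated`, `download_url`) are assumptions standing in for a real metadata record, and the output is neither a complete nor an INSPIRE-conformant feed (for example, no author element is written).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(record):
    """Return an Atom document (bytes) for a hypothetical metadata record dict."""
    # Serialise Atom as the default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", ATOM_NS)

    def atom(tag):
        return f"{{{ATOM_NS}}}{tag}"

    # Atom timestamps use the RFC 3339 format, here normalised to UTC.
    updated = record["updated"].astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("title")).text = record["title"]
    ET.SubElement(feed, atom("id")).text = f"urn:uuid:{record['id']}"
    ET.SubElement(feed, atom("updated")).text = updated

    entry = ET.SubElement(feed, atom("entry"))
    ET.SubElement(entry, atom("title")).text = record["title"]
    ET.SubElement(entry, atom("link"), href=record["download_url"])
    ET.SubElement(entry, atom("id")).text = record["download_url"]
    ET.SubElement(entry, atom("updated")).text = updated
    ET.SubElement(entry, atom("summary")).text = record["abstract"]

    return ET.tostring(feed, encoding="utf-8", xml_declaration=True)


# Hypothetical record, only for illustration.
record = {
    "id": "60a76c80-d399-11d9-b93c-0003939e0af6",
    "title": "Demo dataset",
    "abstract": "Short description of the dataset",
    "updated": datetime(2021, 1, 1, tzinfo=timezone.utc),
    "download_url": "https://example.org/atom-feed/1",
}
print(build_atom_feed(record).decode("utf-8"))
```

Because the document is built from a handful of attributes, generating it per request stays cheap, which matches the decision not to pre-calculate feeds.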

Landing page of client (minimal catalogue)

  1. If none of the optional parameters are given, the user will be greeted by an input field, just like a search bar
    1. Any input on this search bar will perform a request on /api/suggestion, which returns a list of possible related keywords, ordered by relevance. Further information on how to use this api route can be found here
    2. When a search is started using a given input, a request on /api/catalogue will be performed, which delivers catalogue results, based on the given input. Further information on how to use this api route can be found here
    3. Each result is listed with the title, the abstract and the type of resource (metadata.metadata_type -> service|layer|featuretype|dataset|...)
    4. Each result has two buttons on the right:
      1. Download opens the /atom-feed/<id> route with the appropriate id of the search result set --> access to the download client
      2. Feed opens the /service/metadata/<id>/atom-feed route with the appropriate id set --> access to the feed document
  2. Searches on the landing page shall not be handled using AJAX. Instead the start of a search will open the route /atom-feed?q=INPUT, which performs the internal API call and renders the result list
  3. The catalogue API provides pagination, which can be used for pagination on the minimal catalogue UI

Behold! My magnificent layout idea: image

Atom Feed Download Client

Let me explain the technical behaviour of this download interface:

How do we get data from web services? (simplified)

Basically, we use two important operations on WMS and WFS:

  1. WMS
    • wms-url?service=WMS&version=1.1.1&request=GetMap&layers=layer1,layer2,...&bbox=-100.0,-90.0,100.0,90.0&srs=EPSG:4326&format=image/png&width=100&height=100
    • This means that to provide a smaller subset of data for the user, we simply need to change the requested bbox (given as minx,miny,maxx,maxy). All other parameters can stay the same!
  2. WFS
    • wfs-url?SERVICE=WFS&REQUEST=GetFeature&VERSION=1.1.0&TYPENAME=feature_name&BBOX=5539710.91589949745684862,2528178.23551842849701643,5602187.72767475247383118,2633820.21731885056942701,urn:ogc:def:crs:EPSG::31466
    • The same holds for WFS: we just have to iterate over all available feature types of this WFS, use each identifier for the parameter typename and pass the selected bbox to achieve our goal
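The two request patterns above can be sketched as small URL builders. The function names, the example base URLs and the default values are assumptions for illustration; the sketch also assumes that the base URL carries no query string yet. Note that `urlencode` percent-encodes commas, slashes and colons, which servers decode again.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width=100, height=100,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL; only the bbox changes per download area."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": ",".join(layers),
        "styles": "",  # empty value requests the default style
        "bbox": ",".join(str(value) for value in bbox),  # minx,miny,maxx,maxy
        "srs": srs,
        "format": image_format,
        "width": width,
        "height": height,
    }
    return f"{base_url}?{urlencode(params)}"


def wfs_getfeature_urls(base_url, type_names, bbox,
                        crs="urn:ogc:def:crs:EPSG::31466"):
    """Build one WFS 1.1.0 GetFeature URL per feature type for the same bbox."""
    bbox_value = ",".join(str(value) for value in bbox) + "," + crs
    urls = []
    for type_name in type_names:
        params = {
            "SERVICE": "WFS",
            "VERSION": "1.1.0",
            "REQUEST": "GetFeature",
            "TYPENAME": type_name,
            "BBOX": bbox_value,
        }
        urls.append(f"{base_url}?{urlencode(params)}")
    return urls


print(wms_getmap_url("https://example.org/wms", ["layer1", "layer2"],
                     (-100.0, -90.0, 100.0, 90.0)))
for url in wfs_getfeature_urls("https://example.org/wfs",
                               ["feature_a", "feature_b"], (1, 2, 3, 4)):
    print(url)
```

Keeping the bbox as the only varying argument mirrors the idea that each drawn area simply re-runs the same request with a different extent.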

Back to requirements

  1. The user must be able to see whether he/she is logged in or not. This means that in one of the upper corners, the user name should be visible (if logged in), otherwise guest should be visible.
    1. If a logged-in user performs requests on internal services, the MrMap logic will automatically crop the resulting data to the allowed areas
  2. There must be a leaflet client, which zooms automatically to the extent of the web service
    • The leaflet client must render the service extent as a polygon to indicate for which area data is available
    • The leaflet client must include the leaflet-geoman plugin, just as it is used in the editor-access-geometry-form, so the user will be able to draw intersecting bounding boxes!
      • the plugin can be configured so that only the rectangle tool is available. We will read the requested bbox from this polygon!
      • if multiple polygons are drawn by the user, each one leads to its own request, using the specific bbox of that polygon
  3. There must be a Download button, which starts the appropriate WMS/WFS request(s)
    • To keep bots from spamming us, we need a reCaptcha plugin here, which has to check whether the user is a real human being or not
    • This means the extents of the drawn polygons will be retrieved from the leaflet client and sent via POST in one request to /atom-feed/<id>/download. How multiple polygons are transferred doesn't matter, as long as they can be retrieved correctly on the backend!
      • Please note: Implement this transfer in such a way that we can extend it easily in the future. Let's say we want to give the user more options, such as the resolution at which the WMS images shall be retrieved. Then we need to be able to match these further parameters to each bounding box without problems.
  4. The backend must download and zip the requested data
    1. A new pendingTask has to be created, which will give us the option to show progress on the frontend
      1. Since we know how many requests we have to perform, we can calculate the step size for the progress bar and update the pendingTask record accordingly
    2. As stated beforehand, each bbox results in its own request. After each request, the pendingTask has to be updated.
    3. The result of each request has to be sanity-checked: WMS must provide some kind of image data (check using the PIL/Pillow package), WFS must deliver XML/GML-like data
    4. The results must be stored on the file system inside a unique folder, where all requested data will be stored as files
    5. When all requests have been performed, the folder has to be zipped and named using a generated UUID4
    6. When the folder has been zipped, the original folder has to be removed
    7. A new record of AtomFeedDownload has to be created and persisted, where the generated uuid is set as zip_uuid and all other information accordingly
  5. The download url has to be constructed like /atom-feed/<id>/download/<zip_uuid>, where <zip_uuid> refers to the generated UUID4 of the zipped file
    1. The frontend must inform the user, once the progress bar has been refreshed, that the task is finished and the data can be downloaded for x hours using this link
  6. The user must be able to call the /atom-feed/<id>/download/<zip_uuid> url even after closing the window
    1. The corresponding view method just takes the value and checks for an existing AtomFeedDownload record. If it exists, the corresponding zip file on the file system will be returned using a specific file response of django which supports streaming of larger files.
    2. A one-time periodicTask record with an intervalSchedule, which is set to x hours, must be created after AtomFeedDownload has been created. This periodicTask must delete the zipped file on the file system and the related database entry and remove itself afterwards, to keep things clean.
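The backend part of these requirements (extensible POST payload, one request per bbox, progress steps, sanity check, zipping under a UUID4, removing the folder) can be sketched with the standard library. Everything named here is an assumption: the JSON shape `{"requests": [{"bbox": [...], ...}]}`, the helpers `parse_download_payload`, `looks_like_xml` and `collect_and_zip`, and the `fetch` and `on_progress` callbacks that stand in for the real web service call and the pendingTask update. Only the XML check for WFS is shown; the Pillow-based image check for WMS is left out.

```python
import json
import shutil
import tempfile
import uuid
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_download_payload(body):
    """Parse the assumed POST body; extra keys per item become per-bbox options."""
    requests = []
    for item in json.loads(body)["requests"]:
        bbox = [float(value) for value in item["bbox"]]
        if len(bbox) != 4 or bbox[0] >= bbox[2] or bbox[1] >= bbox[3]:
            raise ValueError(f"invalid bbox: {item['bbox']}")
        options = {key: value for key, value in item.items() if key != "bbox"}
        requests.append({"bbox": bbox, "options": options})
    return requests


def looks_like_xml(payload):
    """Sanity check for WFS results: the payload must be well-formed XML."""
    try:
        ET.fromstring(payload)
    except ET.ParseError:
        return False
    return True


def collect_and_zip(requests, fetch, storage_dir, on_progress=lambda percent: None):
    """Run one fetch per bbox, store the results, zip them and drop the folder."""
    if not requests:
        raise ValueError("at least one bbox is required")
    zip_uuid = uuid.uuid4()
    work_dir = Path(tempfile.mkdtemp(dir=storage_dir))  # unique folder per download
    step = 100 / len(requests)  # progress step size per finished request
    try:
        for index, request in enumerate(requests, start=1):
            payload = fetch(request)
            if not looks_like_xml(payload):
                raise ValueError(f"request {index} did not return XML")
            (work_dir / f"result_{index}.gml").write_bytes(payload)
            on_progress(round(index * step))
        # The archive is written next to the folder, named after the UUID4.
        archive = shutil.make_archive(
            str(Path(storage_dir) / str(zip_uuid)), "zip", root_dir=work_dir
        )
    finally:
        shutil.rmtree(work_dir)  # the original folder is removed in any case
    return zip_uuid, Path(archive)
```

A list of objects keeps each bbox together with its own options, so a later parameter such as an image resolution can be added per item without changing the transfer format. The returned `zip_uuid` is the value that would be stored on the AtomFeedDownload record and used in the download URL.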

Behold! Another fabulous layout: image

jokiefer commented 3 years ago

Take a look at "The syndication feed framework" in the Django documentation.