MetPX / sarracenia

https://MetPX.github.io/sarracenia
GNU General Public License v2.0

GeoJSON filter after_accept plugin #1043

Closed reidsunderland closed 2 months ago

reidsunderland commented 5 months ago

We already support including GeoJSON in messages.

One poll plugin I wrote adds GeoJSON information, because I noticed the API provided it, and I thought it would be interesting to use it. https://github.com/MetPX/sarracenia/blob/development/sarracenia/flowcb/poll/eumetsat.py

Now we need an after_accept plugin that can filter based on the GeoJSON in the message.

i.e. if a region defined in a config file overlaps with the region in the message, accept it. Otherwise, reject the message and don't bother downloading the file.

This after_accept plugin could be used either in a poll (to prevent messages from being posted) or in a sarra/subscribe (to filter messages before downloading).
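A rough sketch of the accept/reject flow described above, written without importing Sarracenia so that it runs stand-alone. It assumes the callback is handed a worklist with `incoming` and `rejected` lists and that each message behaves like a dict with an optional `geometry` field. The `filter_by_region` function, the `in_region` predicate, and the sample messages are hypothetical names and data for illustration only, not part of sr3.

```python
from types import SimpleNamespace


def filter_by_region(worklist, overlaps):
    """Keep messages whose 'geometry' satisfies overlaps(); move the rest to rejected."""
    kept = []
    for msg in worklist.incoming:
        geometry = msg.get("geometry")
        # Assumption: a message without geometry cannot match a region, so reject it.
        if geometry is not None and overlaps(geometry):
            kept.append(msg)
        else:
            worklist.rejected.append(msg)
    worklist.incoming = kept


def in_region(geometry):
    # Placeholder predicate: a Point inside a fixed box of -10..10 degrees.
    lon, lat = geometry["coordinates"]
    return -10.0 <= lon <= 10.0 and -10.0 <= lat <= 10.0


# Stand-in for the worklist passed to after_accept (hypothetical data).
worklist = SimpleNamespace(
    incoming=[
        {"relPath": "a.nc", "geometry": {"type": "Point", "coordinates": [5.0, 5.0]}},
        {"relPath": "b.nc", "geometry": {"type": "Point", "coordinates": [50.0, 50.0]}},
        {"relPath": "c.nc"},
    ],
    rejected=[],
)

filter_by_region(worklist, in_region)
print("accepted:", [m["relPath"] for m in worklist.incoming])
print("rejected:", [m["relPath"] for m in worklist.rejected])
```

Because the filter only moves messages between the two lists, the same function could in principle sit in a poll (before posting) or in a sarra/subscribe (before downloading). Whether messages with no geometry should be rejected or passed through is a policy choice that this sketch simply assumes.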

There are multiple non-Sarracenia polls in use at CIS that do region filtering and only download files that cover the region they want. Implementing this plugin would be one step in the process of replacing these custom polls with sr3.

petersilva commented 5 months ago

That's a great idea. My guess, from googling a bit, is that the fully general answer is to use something like geopandas.

There are pointers to shapely also... but that seems more geometry and less geography focused.

Perhaps a lighter, simpler one is turfpy, but I'm not sure how active it is.

I'm wondering how complex the region specifications from ice are... if they are just 2D rectangles or circles, we can likely do something manually, easily. If we want to compare arbitrary geometry specifications, then I think we will quickly end up using something like GeoPandas or turfpy.

It's a big library, a big dependency... fine if we need it, but heavy if our needs are simpler.
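For the simple rectangle case, the manual approach might look something like the sketch below, which uses only the standard library. It reduces any GeoJSON geometry to its bounding box and compares boxes. The function names are hypothetical, and it assumes 2D `[west, south, east, north]` boxes, geometries that carry `coordinates` (so no GeometryCollection), and regions that do not cross the antimeridian.

```python
def iter_positions(coords):
    """Yield every [lon, lat] position from nested GeoJSON coordinates."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for part in coords:
            yield from iter_positions(part)


def bbox_of(geometry):
    """Return (west, south, east, north) for a GeoJSON-like geometry dict."""
    if "bbox" in geometry:
        # Assumption: a 2D bbox with exactly four values.
        return tuple(geometry["bbox"])
    positions = list(iter_positions(geometry["coordinates"]))
    lons = [p[0] for p in positions]
    lats = [p[1] for p in positions]
    return (min(lons), min(lats), max(lons), max(lats))


def bboxes_overlap(a, b):
    """True when two (west, south, east, north) boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


region = {"bbox": [-10.0, -10.0, 10.0, 10.0]}
product = {
    "type": "Polygon",
    "coordinates": [[[5.0, 5.0], [20.0, 5.0], [20.0, 20.0], [5.0, 20.0], [5.0, 5.0]]],
}
far_away = {"type": "Point", "coordinates": [50.0, 50.0]}

print(bboxes_overlap(bbox_of(region), bbox_of(product)))   # True
print(bboxes_overlap(bbox_of(region), bbox_of(far_away)))  # False
```

A bounding-box comparison is coarse: two shapes whose boxes overlap may not actually intersect, so it can accept files a true geometric test would reject. That may be acceptable for a download filter, where a false accept only costs an extra download.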

petersilva commented 5 months ago

How do the CIS polls do the region filtering? The client is likely more versed in Python + GIS than we are. We should at least study what they have done.

petersilva commented 5 months ago

I could imagine something dead simple like: https://turfpy.readthedocs.io/en/latest/measurements/distance.html

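from geojson import Feature, Point
from turfpy import measurement

# Point we care about, and the point the product reports covering
interested_in = Feature(geometry=Point((-75.343, 39.984)))
product_covers = Feature(geometry=Point((-75.534, 39.123)))

# Accept only when the two points coincide (distance of zero)
if measurement.distance(interested_in, product_covers) == 0:
    print("accept")
else:
    print("reject")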

That might be enough for a proof of concept.
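One caveat with comparing a distance to exactly zero is that it only matches when the two points coincide. A variation on the same proof of concept, sketched here with the standard library instead of turfpy, accepts a product when it lies within some radius of the point of interest. The `haversine_km` and `within_radius` helpers and the 100 km radius are assumptions for illustration, not anything proposed in the thread.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(p1, p2):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p1[0], p1[1], p2[0], p2[1]))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(p1, p2, radius_km):
    """True when p2 is no farther than radius_km from p1."""
    return haversine_km(p1, p2) <= radius_km


interested_in = (-75.343, 39.984)
product_covers = (-75.534, 39.123)

# Hypothetical 100 km tolerance around the point of interest.
print("accept" if within_radius(interested_in, product_covers, 100.0) else "reject")
```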

reidsunderland commented 5 months ago

It seems like CIS uses GDAL.

petersilva commented 5 months ago

GDAL is a wrapper around C++, which brings all manner of deployment pain. I think life is easier if we use turfpy, which is pure Python, but we can still study their examples to see what kind of operations they use in GDAL to select, and see if it's easy to map.

petersilva commented 4 months ago

The field in the message to interpret is geometry. It should have a value like the field of the same name in a GeoJSON file.

petersilva commented 4 months ago

in a config file:


geometry  { "bbox": [-10.0, -10.0, 10.0, 10.0] }

options.add_option( type=str, ... and parse the value with the json module.

Or use type=list, and join all the string values before you parse:


geometry  {
geometry      "type": "Polygon",
geometry      "coordinates": [
geometry          [
geometry              [-10.0, -10.0],
geometry              [10.0, -10.0],
geometry              [10.0, 10.0],
geometry              [-10.0, -10.0]
geometry          ]
geometry      ]
geometry  }

but that's quite ugly...
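Both option styles above could be handled by one small helper, sketched here under the assumption that the option value reaches the plugin either as a single string (type=str) or as a list of strings, one per `geometry` line (type=list). The `parse_geometry_option` name is hypothetical and nothing here calls Sarracenia itself.

```python
import json


def parse_geometry_option(value):
    """Parse a 'geometry' option given as one JSON string or a list of fragments."""
    if isinstance(value, list):
        # Join the per-line fragments back into a single JSON document.
        value = " ".join(value)
    return json.loads(value)


# Single-line style: the whole JSON object on one 'geometry' line.
single = '{ "bbox": [-10.0, -10.0, 10.0, 10.0] }'

# Multi-line style: one fragment per 'geometry' line, joined before parsing.
fragments = [
    "{",
    '"type": "Polygon",',
    '"coordinates": [',
    "[",
    "[-10.0, -10.0],",
    "[10.0, -10.0],",
    "[10.0, 10.0],",
    "[-10.0, -10.0]",
    "]",
    "]",
    "}",
]

print(parse_geometry_option(single))
print(parse_geometry_option(fragments)["type"])
```

With the list style, the fragments only parse if together they form one complete JSON object, so a missing brace or bracket in the config file surfaces as a `json.JSONDecodeError` at startup rather than as a silent mismatch later.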