swisstopo / topo-satromo

Erdbeobachtungs-Satellitendaten fürs Trockenheitsmonitoring (SATROMO)
BSD 3-Clause "New" or "Revised" License
14 stars 3 forks source link

Automated Extraction of Raster Statistics and Data Availability Calculation for Shapefile Polygons #72

Closed davidoesch closed 5 months ago

davidoesch commented 7 months ago

Feature Request: Automated Extraction of Raster Statistics and Data Availability Calculation for Shapefile Polygons

Is your feature request related to a problem? Please describe. Currently, there's a need to extract raster statistics for each polygon in a shapefile manually, which can be time-consuming and error-prone. Automating this process would streamline the workflow and improve efficiency.

Describe the solution you'd like A py script that automates the extraction of raster statistics for each polygon in a shapefile. The script should:

  1. Read the input shapefile containing polygon geometries.
  2. Access a raster file containing the data of interest.
  3. For each polygon in the shapefile:
    • Mask the raster using the polygon geometry.
    • Calculate statistics (e.g., mean, min, max) of the raster values within the polygon.
    • Determine the percentage of data availability within the polygon area.

Code:

GeoJSON

  1. Export the results to CSV, GeoJSON, and GeoParquet formats.
  2. Include options for setting parameters such as the raster file URL, missing data values, date in ISO 8601 format, and output file name.

Describe alternatives you've considered Currently, the process involves manual extraction of raster statistics using GIS software or scripting languages like Python with libraries such as geopandas and rasterio. However, automating this process would eliminate the need for manual intervention and reduce the likelihood of errors.

Additional context

davidoesch commented 7 months ago

Test result with dummy data based on https://github.com/swisstopo/topo-satromo/blob/dev/main_functions/main_extract_warnregions.py

the attributes names for region and region number are from the test Shapefile . To be discussed Get the files here

test_geojson_parquet.zip

Please test and comment

p1d1d1 commented 7 months ago

@davidoesch you don't need to argument for GeoJSON in WGS84 since this is the standard CRS. You'd have to argument and reference section 4 of the RFC in case you'd use a different CRS.

Files seem ok to me. Question: what is the CRS of the raster data you provide via STAC?

p1d1d1 commented 7 months ago

PS: Not sure your date time values are ISO8601 valid. Shouldn't they be in the form: 2024-03-07T23:59:59?

davidoesch commented 7 months ago

yes, 2024-03-07T23:59:59 will fix that ( the small t is due to the fact that I copy pasted it from the item)

Rdataflow commented 7 months ago

@davidoesch you may cut the precision of WGS84 coordinates to a reasonable size

 [ 8.269, 47.576], [ 8.274, 47.539]

instead of

 [ 8.269951333839606, 47.57603677523322 ], [ 8.274097915977105, 47.539072390916687 ]
davidoesch commented 7 months ago

@davidoesch you don't need to argument for GeoJSON in WGS84 since this is the standard CRS. You'd have to argument and reference section 4 of the RFC in case you'd use a different CRS.

Files seem ok to me. Question: what is the CRS of the raster data you provide via STAC?

Raster CRS is 2056

Rdataflow commented 7 months ago

@davidoesch don't forget about the timezone. Shouldn't datetime ISO8601 be in the form:

davidoesch commented 7 months ago

@davidoesch don't forget about the timezone. Shouldn't datetime ISO8601 be in the form:

  • 2024-03-07T23:59:59Z (UTC) or
  • 2024-03-07T23:59:59+01:00 (MEZ) or
  • 2024-03-07T23:59:59+02:00 (MESZ)

according to latest ISO 8601 2022 version An amendment was published in October 2022 featuring minor technical clarifications and attempts to remove ambiguities in definitions. The most significant change, however, was the reintroduction of the "24:00:00" format to refer to the instant at the end of a calendar day.

davidoesch commented 7 months ago

@p1d1d1

Adding a global_date instead of one date per feature mit

"type": "FeatureCollection",
"global_date": "2024-03-07T23:59:59",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [

Ist gemäss https://www.itb.ec.europa.eu/json/geojson/upload dem standard entsprechend

-> your take? ok so aus der Sicht Geostandards?

Beispiel file

ch.swisstopo.swisseo_vhi_v100_2024-03-07t235959_forest_region-39_V3.zip

p1d1d1 commented 7 months ago

@davidoesch:

p1d1d1 commented 7 months ago

@davidoesch and if you could use camel case for the properties name would be nice! so e.g.

"properties": { "regionNr": 1, "name": "östlicher Jura", "vhiMean": 75, "availabilityPercentage": 100.0

davidoesch commented 6 months ago

Added a new version geojson and parquet

Fixed:

Files to test: ch.swisstopo.swisseo_vhi_v100_2024-03-07t240000_forest_region-39_v2.zip

GeoJSON PARQUET
QGIS 3.30.2 ok ok
ArcGIS Pro 3.2 ok to be tested
ArcMap via arcpy.conversion.JSONToFeatures kein support
davidoesch commented 6 months ago

Parquet