alexander-petkov / wfas

A placeholder for the WFAS project.

Data availability web page? #65

Open wmjolly opened 8 months ago

wmjolly commented 8 months ago

Could we make an API or web page that queries the GetCapabilities document for key layers on GeoServer and provides a summary of the spatial and temporal extents of each dataset?

It would be great for monitoring and troubleshooting, and also helpful in the future when we share the resource more widely.

Example:
Dataset: GFS
Spatial Extent: -180 to 180 Lon, 90 to -90 Lat
Temporal Extent: 15 Dec 2023 to 10 Feb 2024
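A minimal sketch of how such a summary could be pulled out of a WMS 1.3.0 GetCapabilities document, using only the Python standard library. The sample XML and the layer name `gfs:Temperature` are made up for illustration and are not real output from this server; the function name `summarize_layers` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# WMS 1.3.0 capabilities documents use this XML namespace.
WMS_NS = {"wms": "http://www.opengis.net/wms"}

# Tiny hand-written sample for illustration only (hypothetical layer name).
SAMPLE = """<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Capability>
    <Layer>
      <Title>root</Title>
      <Layer>
        <Name>gfs:Temperature</Name>
        <EX_GeographicBoundingBox>
          <westBoundLongitude>-180</westBoundLongitude>
          <eastBoundLongitude>180</eastBoundLongitude>
          <southBoundLatitude>-90</southBoundLatitude>
          <northBoundLatitude>90</northBoundLatitude>
        </EX_GeographicBoundingBox>
        <Dimension name="time" units="ISO8601">2023-12-15T00:00:00Z,2023-12-15T03:00:00Z,2024-02-10T00:00:00Z</Dimension>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>"""

BBOX_TAGS = ("westBoundLongitude", "southBoundLatitude",
             "eastBoundLongitude", "northBoundLatitude")


def summarize_layers(capabilities_xml):
    """Return one dict per named layer with its bbox and time extent."""
    root = ET.fromstring(capabilities_xml)
    summaries = []
    for layer in root.iterfind(".//wms:Layer", WMS_NS):
        name = layer.findtext("wms:Name", namespaces=WMS_NS)
        if name is None:
            continue  # container layers carry no Name
        summary = {"name": name, "bbox": None,
                   "time_start": None, "time_end": None}
        bbox = layer.find("wms:EX_GeographicBoundingBox", WMS_NS)
        if bbox is not None:
            # Stored as (west, south, east, north).
            summary["bbox"] = tuple(
                float(bbox.findtext(f"wms:{tag}", namespaces=WMS_NS))
                for tag in BBOX_TAGS
            )
        for dim in layer.iterfind("wms:Dimension", WMS_NS):
            if dim.get("name") == "time" and dim.text and dim.text.strip():
                # Values are a comma-separated list; each entry is either a
                # single instant or a start/end/period interval.
                values = [v.strip() for v in dim.text.strip().split(",")]
                summary["time_start"] = values[0].split("/")[0]
                last = values[-1].split("/")
                summary["time_end"] = last[1] if len(last) > 1 else last[0]
        summaries.append(summary)
    return summaries


if __name__ == "__main__":
    for row in summarize_layers(SAMPLE):
        print(row)
```

Against a live server, the XML would come from a request with the standard `service=WMS&version=1.3.0&request=GetCapabilities` parameters; the parsing step stays the same.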

alexander-petkov commented 7 months ago

A guide to building a Wicket extension:
https://docs.geoserver.org/stable/en/developer/programming-guide/web-ui/overview.html
https://docs.geoserver.org/maintain/en/developer/programming-guide/web-ui/implementing.html
https://docs.geoserver.org/stable/en/developer/programming-guide/wicket-pages/index.html

alexander-petkov commented 5 months ago

Explore querying datasets by setting attributes: [image]

https://docs.geoserver.org/main/en/user/data/webadmin/workspaces.html

The other way I can think of is by querying the database.

EDIT: Also explore gathering info and identifying problems via #67

alexander-petkov commented 5 months ago

What should a status page reveal about a dataset? Some ideas:

  1. Last updated
  2. Number of granules / temporal extent
  3. Identify gaps in data. That leads to the question: what is a gap? There are 1-, 3-, and 6-hourly datasets, and the definition of a gap will differ between them.
  4. Spatial coverage
  5. Retention period
  6. Number of granules:
     6.1 Expected granules
     6.2 Actual count
  7. Update frequency
  8. Last update
  9. Detected problems (yes/no or green/red icon)
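One possible answer to the "what is a gap?" question is to define it relative to each dataset's own update interval. The sketch below assumes the granule timestamps are already available as `datetime` objects and that the interval (1, 3, or 6 hours) is known per dataset; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step_hours):
    """Return (first_missing, last_missing) pairs for every hole in the series.

    A gap is any pair of consecutive granules that are further apart than
    the dataset's nominal step.
    """
    step = timedelta(hours=step_hours)
    ordered = sorted(set(timestamps))
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > step:
            gaps.append((prev + step, curr - step))
    return gaps


def expected_granules(start, end, step_hours):
    """Number of granules a gap-free series from start to end would hold."""
    return int((end - start) / timedelta(hours=step_hours)) + 1


if __name__ == "__main__":
    t0 = datetime(2023, 12, 15)
    # 3-hourly series with the 09:00 granule missing.
    times = [t0 + timedelta(hours=h) for h in (0, 3, 6, 12)]
    print("gaps:", find_gaps(times, step_hours=3))
    print("expected:", expected_granules(times[0], times[-1], 3),
          "actual:", len(times))
```

Comparing `expected_granules` with the actual count gives a cheap yes/no problem flag, while `find_gaps` says where the holes are.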

| Abbreviation | Name | Description | Workspace | Number of granules | Spatial Coverage |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |

create table dataset_index(abbreviation varchar, name varchar, num_granules integer, spatial_coverage varchar);

alexander-petkov commented 5 months ago

So here is an idea:

  1. For each data archive (workspace), have a designated "special" metadata layer, which gets updated via REST upon data retrieval.
  2. It could be a JSON document that gets updated, for example metadata.json.
  3. Geometry is not important, but it could be the bounding envelope, for example.
  4. It could hold number of granules (or features) removed/added for each layer, and consequently detected problems.
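A rough sketch of what such a metadata.json record might look like and how it could be updated after each retrieval. Every field name here (`workspace`, `bbox`, `layers`, `num_granules`, and so on) and the layer name `Temperature` are assumptions for illustration; nothing below talks to GeoServer itself.

```python
import json
from datetime import datetime, timezone


def update_metadata(metadata, layer, added, removed, problems=(), now=None):
    """Record the outcome of one data retrieval for one layer.

    `metadata` is the parsed content of a hypothetical metadata.json file.
    """
    now = now or datetime.now(timezone.utc)
    layers = metadata.setdefault("layers", {})
    entry = layers.setdefault(layer, {"num_granules": 0})
    entry["num_granules"] += added - removed
    entry["added"] = added
    entry["removed"] = removed
    entry["problems"] = list(problems)
    entry["last_updated"] = now.isoformat()
    return metadata


if __name__ == "__main__":
    # Hypothetical starting document for one workspace; bbox is the
    # bounding envelope as [west, south, east, north].
    metadata = {"workspace": "gfs", "bbox": [-180, -90, 180, 90]}
    update_metadata(metadata, "Temperature", added=8, removed=2)
    update_metadata(metadata, "Temperature", added=4, removed=4,
                    problems=["time gap detected"])
    print(json.dumps(metadata, indent=2))
```

Keeping the per-layer counters in one document means a status page would only need to read a single file per workspace.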

alexander-petkov commented 4 months ago

Checking problems with a dataset:

  1. Number of granules/rasters
  2. Number of entries in the database
  3. Time gaps in data.
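The first two checks amount to comparing what is on disk with what the database index lists. A minimal sketch, assuming both sides can be reduced to plain lists of granule file names; how those lists are obtained (a directory listing, a query against the index table) is left out, and the file names shown are invented.

```python
def compare_granules(files_on_disk, indexed_locations):
    """Compare raster files on disk with entries in the database index."""
    on_disk = set(files_on_disk)
    indexed = set(indexed_locations)
    report = {
        "file_count": len(on_disk),
        "index_count": len(indexed),
        # Files that were downloaded but never indexed.
        "missing_from_index": sorted(on_disk - indexed),
        # Index rows pointing at files that no longer exist.
        "missing_from_disk": sorted(indexed - on_disk),
    }
    report["ok"] = not (report["missing_from_index"]
                        or report["missing_from_disk"])
    return report


if __name__ == "__main__":
    # Hypothetical file names for illustration.
    disk = ["t_2024021000.tif", "t_2024021003.tif", "t_2024021006.tif"]
    index = ["t_2024021000.tif", "t_2024021003.tif", "t_2024020921.tif"]
    print(compare_granules(disk, index))
```

The `ok` flag maps directly onto a green/red indicator, and the two lists say which side needs cleanup.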