Ouranosinc / pavics-sdi

Power Analytics and Visualization for Climate Science - Spatial Data Infrastructure
https://pavics-sdi.readthedocs.io

Not testing Geoserver on other PAVICS deployments than the production host #183

Open tlvu opened 3 years ago

tlvu commented 3 years ago

The notebook https://github.com/Ouranosinc/pavics-sdi/blob/400c0f920b307fffc984b4b97c7e8d12c371b756/docs/source/notebooks/WFS_example.ipynb hardcodes http://boreas.ouranos.ca/geoserver/wfs, so the test suite at https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests cannot replace the hostname to target other PAVICS deployments. As a result, the Geoserver instances on other PAVICS deployments are not tested.
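One way to make the hostname replaceable is to derive the WFS URL from an environment variable and fall back to the hardcoded default. This is only a sketch: the variable name `PAVICS_HOST` and the helper `wfs_url` are assumptions for illustration, not something the notebook or the test suite is confirmed to use in this form.

```python
import os
from urllib.parse import urlsplit, urlunsplit

# URL currently hardcoded in WFS_example.ipynb.
DEFAULT_WFS_URL = "http://boreas.ouranos.ca/geoserver/wfs"


def wfs_url(default=DEFAULT_WFS_URL, env_var="PAVICS_HOST"):
    """Return the WFS URL, swapping in the host from `env_var` when it is set.

    `PAVICS_HOST` is a hypothetical variable name chosen for this sketch.
    """
    host = os.environ.get(env_var)
    if not host:
        # No override: keep targeting the default deployment.
        return default
    parts = urlsplit(default)
    # Replace only the network location; scheme and path stay unchanged.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

With the variable unset the notebook keeps demonstrating the default host; a test run against another deployment would only need to export the variable.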

However, allowing other Geoserver instances to be targeted means we have to provide test data matching the needs of the notebook WFS_example.ipynb. So this means

FYI @MatProv, @Zeitsperre, @huard

tlvu commented 3 years ago

Related discussion https://github.com/bird-house/birdhouse-deploy/issues/6#issuecomment-719584374

huard commented 3 years ago

I think there are two distinct issues that we should differentiate:

  1. Testing a new stand-alone installation
  2. Demonstrating PAVICS@ouranos

The notebooks are designed for 2. In this case, it might be better to have a synthetic dataset and associated test for 1 rather than trying to merge 1 and 2 together in a notebook.

tlvu commented 3 years ago

> I think there are two distinct issues that we should differentiate:
>
> 1. Testing a new stand-alone installation
> 2. Demonstrating PAVICS@ouranos
>
> The notebooks are designed for 2. In this case, it might be better to have a synthetic dataset and associated test for 1 rather than trying to merge 1 and 2 together in a notebook.

Agreed. The advantage of mixing both together is that we can test that our demo of PAVICS@ouranos still works, and also save time by not writing new test cases.

But sometimes it might not be worth it.

I am open to other suggestions on how to test the Geoserver end-to-end.

fmigneault commented 1 year ago

@tlvu

> However allowing targetting other Geoserver means we have to provide test data matching the needs of the notebook WFS_example.ipynb. So this means

Can you extract the public:canada_admin_boundaries layer referenced in the notebook to add it to https://github.com/bird-house/birdhouse-deploy/pull/381?

This way, we can also validate that everything works when https://github.com/bird-house/birdhouse-deploy/pull/348 is ready as well.

tlvu commented 1 year ago

> Can you extract the public:canada_admin_boundaries layer referenced in the notebook to add it to bird-house/birdhouse-deploy#381?

Asking our in-house Geoserver power user @Zeitsperre: can the request from @fmigneault above be done?

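The issues that need to be sorted out are:

  • Ensure this dataset can be distributed publicly legally
  • Provide a mechanism to distribute this dataset and load it into Geoserver without using Geoserver WebUI for test automation
  • How big is this dataset, where to host it?

The reason is that all staging and test servers of PAVICS should be able to load this Geoserver data unattended.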

fmigneault commented 1 year ago

> Ensure this dataset can be distributed publicly legally

Note that I'm not fixed on that specific dataset if there is an issue. Anything that can be swapped for the test notebook is fine. Though, if this one is not allowed, there would be an actual issue because this layer is already available publicly! https://pavics.ouranos.ca/geoserver/ows?service=WFS&acceptversions=2.0.0&request=GetFeature&layers=public:canada_admin_boundaries&typeName=public:canada_admin_boundaries&bbox=-74.5,45.2,-73,46
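For reference, the public query above can be assembled from its parts, which makes it easier to point the same request at another host or layer. The helper name `get_feature_url` and the optional `output_format` argument are assumptions made for this sketch; the query parameters themselves mirror the URL in this comment, minus the redundant `layers` parameter.

```python
from urllib.parse import urlencode


def get_feature_url(base_url, type_name, bbox, version="2.0.0", output_format=None):
    """Build a WFS GetFeature URL for `type_name` restricted to `bbox`.

    `bbox` is (min_x, min_y, max_x, max_y), in the order used by the URL above.
    """
    params = {
        "service": "WFS",
        "acceptversions": version,
        "request": "GetFeature",
        "typeName": type_name,
        "bbox": ",".join(str(value) for value in bbox),
    }
    if output_format:
        # Optional, e.g. "application/json" to ask GeoServer for GeoJSON.
        params["outputFormat"] = output_format
    # Keep ':' ',' and '/' readable instead of percent-encoding them.
    return f"{base_url}?{urlencode(params, safe=':,/')}"


url = get_feature_url(
    "https://pavics.ouranos.ca/geoserver/ows",
    "public:canada_admin_boundaries",
    (-74.5, 45.2, -73, 46),
)
```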

> • Provide a mechanism to distribute this dataset and load it into Geoserver without using Geoserver WebUI for test automation
> • How big is this dataset, where to host it?

It can be a snapshot for test purposes. It does not need to be updated automatically. It can also be a subset that matches the bbox area of the test notebook if the original is big. This test sample would live in birdhouse-deploy as an optional-component for tests. That component could either place it in the right location in the stack so that it is mounted in GeoServer directly (if that is sufficient/possible?), or do a one-shot docker run to post the features via the GeoServer API.
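Cutting such a subset does not need any GIS tooling if the snapshot is GeoJSON. The sketch below keeps every feature whose bounding box intersects the notebook's bbox. The function names are hypothetical, and it assumes plain geometries with a `coordinates` member (no GeometryCollection) in the same CRS as the bbox.

```python
def iter_positions(coords):
    """Yield (x, y) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def intersects_bbox(feature, bbox):
    """True when the feature's bounding box overlaps (min_x, min_y, max_x, max_y)."""
    geometry = feature.get("geometry") or {}
    positions = list(iter_positions(geometry.get("coordinates") or []))
    if not positions:
        # Features without coordinates are dropped from the subset.
        return False
    min_x, min_y, max_x, max_y = bbox
    xs, ys = zip(*positions)
    return min(xs) <= max_x and max(xs) >= min_x and min(ys) <= max_y and max(ys) >= min_y


def subset_by_bbox(collection, bbox):
    """Return a new FeatureCollection with only the features touching `bbox`."""
    kept = [f for f in collection["features"] if intersects_bbox(f, bbox)]
    return {"type": "FeatureCollection", "features": kept}
```

This is a bounding-box test, not an exact geometric intersection, so it may keep a few features that only come close to the area. For a test fixture that over-selection is harmless.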

Zeitsperre commented 1 year ago

Hi all, please excuse the radio silence, I was getting re-certified for first aid this week.

> Can you extract the public:canada_admin_boundaries layer referenced in the notebook to add it to bird-house/birdhouse-deploy#381?
>
> This way, we can also validate that everything works when bird-house/birdhouse-deploy#348 is ready as well.

Absolutely, the dataset can be found via the Canada Census Boundaries geometries. It is publicly available data under the Open Canada License. No legal distribution issues. I'll find a copy and convert it to GeoPackage or GeoJSON (anything but Shapefile).

In order to load the file into GeoServer, this project would likely be one of the better candidates (geoserver-rest). It's much more mature than it was just a few years ago. Once the data is locally available to the server/service, the library has dataset publishing functions.

Will confirm the size and report back.

Edit: The compressed dataset is around 170 MB, so we would need to host it somewhere (or fetch it on deployment? https://www12.statcan.gc.ca/census-recensement/2021/geo/sip-pis/boundary-limites/files-fichiers/lpr_000b21g_e.zip). Interestingly, StatCan offers an ESRI REST service for the layers now: https://geo.statcan.gc.ca/geo_wa/rest/services/2021/Cartographic_boundary_files/MapServer.

fmigneault commented 1 year ago

The notebook only seems to query WFS in GeoJSON format and display it on a map. If the original example is 170MB, I think it is better to find a new one and simply update the notebook. The test layer could be anything much smaller than this.

Zeitsperre commented 1 year ago

@fmigneault

In that case, literally anything in a geospatial format from here would be fine: https://www.donneesquebec.ca/recherche/dataset. Lots of options and I can add anything we'd like to the production GeoServer.

fmigneault commented 1 year ago

Ok. Random selection: https://www.donneesquebec.ca/recherche/dataset/vmtl-lieux-batiments-vocation-publique

fmigneault commented 1 year ago

Another alternative is to POST data to the Geoserver REST endpoint before querying it in https://github.com/Ouranosinc/pavics-sdi/blob/master/docs/source/notebooks/WFS_example.ipynb. Example notebook doing it: https://app.reviewnb.com/Ouranosinc/PAVICS-e2e-workflow-tests/pull/125/
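As a rough illustration of the REST approach (not the code from the linked notebook), the first step of provisioning an empty GeoServer is usually creating a workspace through `POST /rest/workspaces`. The sketch below only builds the request with the standard library. The function names, the workspace name, and the credentials are placeholders assumed for the example.

```python
import base64
import json
from urllib.request import Request


def basic_auth(user, password):
    """Return an HTTP Basic `Authorization` header value."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"


def create_workspace_request(geoserver_url, workspace, user, password):
    """Build (but do not send) the request that creates a GeoServer workspace."""
    body = json.dumps({"workspace": {"name": workspace}}).encode()
    return Request(
        f"{geoserver_url.rstrip('/')}/rest/workspaces",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": basic_auth(user, password),
        },
    )


# Placeholder host, workspace and credentials; adjust for the deployment under test.
req = create_workspace_request("http://localhost:8080/geoserver", "testdata", "admin", "geoserver")
# To actually send it: urllib.request.urlopen(req)
```

Uploading the layer itself would follow the same pattern against the datastore endpoints of the REST API. That part is left out here because the exact endpoint and content type depend on the file format chosen for the test data.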

tlvu commented 11 months ago

@fmigneault

> In that case, literally anything in a geospatial format from here would be fine: https://www.donneesquebec.ca/recherche/dataset. Lots of options and I can add anything we'd like to the production GeoServer.

Just to be clear, this is not about adding to the production GeoServer. This is about automating the data provisioning in a fresh and empty GeoServer so that every test instance of PAVICS has the matching data for the test notebook to run.

fmigneault commented 11 months ago

Yes of course. The data is necessary only for the test.