Ouranosinc / raven

WPS services related to hydrological modeling
https://pavics-raven.readthedocs.io
MIT License
37 stars 12 forks source link

Test OWSLib PR allowing POST for WFS `getfeature` requests #297

Closed huard closed 3 years ago

huard commented 4 years ago

Description

We need to make WFS requests for info on many subregions. For a large number of regions, the GET requests exceeds the maximum size. See https://github.com/geopython/OWSLib/issues/439

Francis made a PR https://github.com/geopython/OWSLib/pull/706 to solve this, and we'd need a review of the PR, as well as confirmation that it solves our issue, illustrated in https://pavics-sdi.readthedocs.io/projects/raven/en/latest/notebooks/Subset_climate_data_over_watershed.html#Subsetting-a-gridded-climate-dataset https://github.com/Ouranosinc/raven/blob/master/docs/source/notebooks/gridded_data_subset.ipynb

The notebook sets the number of features to 2, but it should work with 100s of features.

TODO:

Zeitsperre commented 4 years ago

If I recall correctly, I had established that there was a problem via a very large request run in a Jupyter notebook. If I can find the issue where I raised this and the notebook/environment where I established the problem existed, it'll make things much easier to formally establish that the issue is fixed.

I'll have some time available to put some work into this as we near the Symposium.

Zeitsperre commented 4 years ago

@f-PLT

Thanks for your work on getting POST support into OWSlib. I'm pinging you on this thread as this problem is RAVEN-centric.

I seem to not be grasping the differences between GET and POST request formatting. Here's a snippet from a request formatting function in a PR I'm currently working towards:

    wfs = WebFeatureService(GEO_URL, version="1.1.0", timeout=30)
    if method.upper() == "GET":
        try:
            resp = wfs.getfeature(
                typename=layer,
                bbox=coordinates,
                srsname="urn:x-ogc:def:crs:EPSG:4326",
                method=method,
            )
        except Exception as e:
            raise Exception(e)
    elif method.upper() == "POST":
        if parse(ows_version) < parse("0.20.0"):
            raise NotImplementedError("POST requires OWSlib >= 0.20.0")
        try:
            resp = wfs.getfeature(typename=layer, bbox=coordinates, method=method)
        except Exception as e:
            raise Exception(e)
    else:
        raise NotImplementedError("Method %s not implemented" % method)

Does the `WebFeatureService().getfeature() method work for POST queries? I've combed your PR and haven't been able to find anything suggesting that it wouldn't work, but I seem to only be receiving this cryptic error when the method is POST:

owslib.util.ServiceException: Could not determine geoserver request from http request org.geoserver.platform.AdvancedDispatchFilter$AdvancedDispatchHttpRequest@577bc7f

Any advice?

f-PLT commented 4 years ago

Yes, I did the implementation for Post in 1.1.0 and 2.0.0.

I tested with a few public WFS services, with both versions, but I did have this kind of error when testing on Boreas (https://boreas.ouranos.ca/geoserver/wfs)

An ordinary Post request to that address with an xml filter worked well, but through OWSLib, that error always popped up with Post.

It has to do with this part : https://github.com/f-PLT/OWSLib/blob/wfs-getfeature-post/owslib/feature/__init__.py#L314

Basically, I took this part from the Get section of getfeature(), which takes the url from the GetCapabilities. In the case of Boreas, that address is http://boreas.ouranos.ca:80/geoserver/wfs which works for method=GET, but not with POST. I tried ordinary Post requests to that specific address and got the same kind of error as before (and yours). I gather it's an accessibility issue, as the unsecured address is not reachable outside of it's deployment network, and there's apparently no redirection to https. Might be related to how POST requests work compared to GET requests.

If I got rid of the url finding cited above and went with base_url = self.url, it would then work, but then lost the automatic redirections.

All this is from memory, I'll take an other look at it on Monday to confirm things, and I could also give you the python script I used while implementing this.

Assuming it's the same cause, of course.

huard commented 3 years ago
def get_feature(url, typename, features):
    """Return geometry from WFS server."""
    wfs = WebFeatureService(url, version='2.0.0')
    resp = wfs.getfeature([typename], featureid=features,
                          outputFormat='application/json',
                          method="Post")
    return json.loads(resp.read())

def test_get_feature():
    url = "https://pavics.ouranos.ca/geoserver/wfs"
    typename = "public:USGS_HydroBASINS_lake_na_lev12"
    feature_pat = "USGS_HydroBASINS_lake_na_lev12.{}"

    #features = [feature_pat.format(i) for i in range(67088, 67449)]
    features = [feature_pat.format(i) for i in range(67088, 67089)]
    get_feature(url, typename, features)

Fails with

Traceback (most recent call last):
  File "/home/david/src/flyingpigeon/tests/test_subset.py", line 10, in test_get_feature
    get_feature(url, typename, features)
  File "/home/david/src/flyingpigeon/flyingpigeon/subset_base.py", line 18, in get_feature
    method="Post")
  File "/home/david/src/OWSLib/owslib/feature/wfs200.py", line 324, in getfeature
    u = openURL(url, data, method, timeout=self.timeout, headers=self.headers, auth=self.auth)
  File "/home/david/src/OWSLib/owslib/util.py", line 230, in openURL
    raise ServiceException('\n'.join([t.strip() for t in serviceException.itertext() if t.strip()]))
owslib.util.ServiceException: Could not determine geoserver request from http request org.geoserver.platform.AdvancedDispatchFilter$AdvancedDispatchHttpRequest@3b972d1a
huard commented 3 years ago

@f-PLT Making this test function pass is our use case.

huard commented 3 years ago

See https://stackoverflow.com/questions/24952960/how-to-correctly-request-a-geoserver-wfs-via-post

huard commented 3 years ago

I see this header is already set in the code... stumped.

tlvu commented 3 years ago

Got this stack-trace server-side from docker logs geoserver that looks like the same error as in https://github.com/Ouranosinc/raven/issues/297#issuecomment-731432190:

23 Nov 15:31:12 ERROR [geoserver.ows] -                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
org.geoserver.platform.ServiceException: Could not determine geoserver request from http request org.geoserver.platform.AdvancedDispatchFilter$AdvancedDispatchHttpRequest@67147182
        at org.geoserver.ows.Dispatcher.dispatch(Dispatcher.java:622)
        at org.geoserver.ows.Dispatcher.handleRequestInternal(Dispatcher.java:258)
        at org.springframework.web.servlet.mvc.AbstractController.handleRequest(AbstractController.java:147)
        at org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handle(SimpleControllerHandlerAdapter.java:50)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:959)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:893)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:968)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:859)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:622)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:844)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.apache.catalina.filters.CorsFilter.handleNonCORS(CorsFilter.java:436)
        at org.apache.catalina.filters.CorsFilter.doFilter(CorsFilter.java:177)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.ThreadLocalsCleanupFilter.doFilter(ThreadLocalsCleanupFilter.java:28)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.SpringDelegatingFilter$Chain.doFilter(SpringDelegatingFilter.java:75)
        at org.geoserver.wms.animate.AnimatorFilter.doFilter(AnimatorFilter.java:71)
        at org.geoserver.filters.SpringDelegatingFilter$Chain.doFilter(SpringDelegatingFilter.java:71)
        at org.geoserver.filters.SpringDelegatingFilter.doFilter(SpringDelegatingFilter.java:46)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.platform.AdvancedDispatchFilter.doFilter(AdvancedDispatchFilter.java:50)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:316)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:69)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:126)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:90)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:73)
        at org.geoserver.security.filter.GeoServerCompositeFilter.doFilter(GeoServerCompositeFilter.java:92)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:69)
        at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:114)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:73)
        at org.geoserver.security.filter.GeoServerCompositeFilter.doFilter(GeoServerCompositeFilter.java:92)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.geoserver.security.filter.GeoServerAnonymousAuthenticationFilter.doFilter(GeoServerAnonymousAuthenticationFilter.java:54)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:69)
        at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilterInternal(BasicAuthenticationFilter.java:158)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:73)
        at org.geoserver.security.filter.GeoServerCompositeFilter.doFilter(GeoServerCompositeFilter.java:92)
        at org.geoserver.security.filter.GeoServerBasicAuthenticationFilter.doFilter(GeoServerBasicAuthenticationFilter.java:84)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:69)
        at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:91)
        at org.geoserver.security.filter.GeoServerSecurityContextPersistenceFilter$1.doFilter(GeoServerSecurityContextPersistenceFilter.java:53)
        at org.geoserver.security.filter.GeoServerCompositeFilter$NestedFilterChain.doFilter(GeoServerCompositeFilter.java:73)
        at org.geoserver.security.filter.GeoServerCompositeFilter.doFilter(GeoServerCompositeFilter.java:92)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
        at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:213)
        at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:176)
        at org.geoserver.security.GeoServerSecurityFilterChainProxy.doFilter(GeoServerSecurityFilterChainProxy.java:152)
        at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:346)
        at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:262)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.LoggingFilter.doFilter(LoggingFilter.java:87)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.GZIPFilter.doFilter(GZIPFilter.java:48)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.SessionDebugFilter.doFilter(SessionDebugFilter.java:48)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.geoserver.filters.FlushSafeFilter.doFilter(FlushSafeFilter.java:44)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:121)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:94)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:504)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
        at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:620)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:509)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1104)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684)
        at org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor.run(AprEndpoint.java:2445)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:745)
Zeitsperre commented 3 years ago

I understand that OWSlib should be working with any instance of GeoServer that uses the proper standards, but I just want to raise a point here that our current instance is quite old (http://geoserver.org/release/2.9.3/). Do we know with certainty that this isn't a bug with POST in the build?

I'll try doing a quick look over the advancements in the past few years to see if this is covered anywhere. Otherwise, perhaps the three of us can chat sometime?

f-PLT commented 3 years ago

When I implemented the Post method, I ran accros this problem, but only while testing with Boreas (the only Ouranos Geoserver I tried).

I had no problems with other Geoservers, either external or locally deployed.

The only way I got it to work was to sacrifice this bit of code : https://github.com/f-PLT/OWSLib/blob/wfs-getfeature-post/owslib/feature/__init__.py#L312-L320

If I made it use the "outside" address https://pavics.ouranos.ca/geoserver/wfs, meaning only having this line instead https://github.com/f-PLT/OWSLib/blob/wfs-getfeature-post/owslib/feature/__init__.py#L320, it then worked with Boreas, but lost the ability to be redirected by what OWSLib got back from the GetCapabilities, which I gather could be a problem for other servers.

The url given to OWSLib is https://boreas.ouranos.ca/geoserver/wfs, but the GetCapabilities of the WFS is pointing to http://boreas.ouranos.ca:80/geoserver/wfs, which is restricted (and usually redirected?). It seems to the be the case with Pavics : https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0

The same code is present for the Get method, but it does not interfere with the request.

I'm not familiar enough with how requests work and the finer points between Get vs Post requests, but my guess is it could be that the Post request does not handle redirection (or if it's possible to redirect it in the context of how OWSLib handles requests, which is outside of what I implemented).

It would also be consistent with this: https://stackoverflow.com/questions/31620987/redirecting-http-post-requests-to-https-post-requests#:~:text=You%20cannot%20redirect.,3. and https://softwareengineering.stackexchange.com/questions/99894/why-doesnt-http-have-post-redirect

From the above, to redirect a Post request seems like something we should avoid, so the ideal solution would be for the Geoserver's GetCapabilities to list the secured address.

However, I'm pretty sure we rely on Geoserver being unsecured locally so it can interact with the other services. Maybe new flag in getFeature() to force the use of the provided url, instead of searching for the provided one in the GetCapabilities? Not sure how good a design that would be though.

f-PLT commented 3 years ago

I can check into it a bit more with David B. this week, see if he has a better idea.

tlvu commented 3 years ago

I just want to raise a point here that our current instance is quite old (http://geoserver.org/release/2.9.3/)

Maybe we can take this opportunity to address https://github.com/Ouranosinc/pavics-sdi/issues/183? Since right now we are unable to test Geoserver on other PAVICS instance than Boreas meaning if we perform this upgrade, where are we going to test the upgrade?

tlvu commented 3 years ago

The url given to OWSLib is https://boreas.ouranos.ca/geoserver/wfs, but the GetCapabilities of the WFS is pointing to http://boreas.ouranos.ca:80/geoserver/wfs, which is restricted (and usually redirected?). It seems to the be the case with Pavics : https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0

Looks like https://github.com/bird-house/birdhouse-deploy/issues/14 a Twitcher/Magpie bug or new feature needed.

tlvu commented 3 years ago

It would also be consistent with this: https://stackoverflow.com/questions/31620987/redirecting-http-post-requests-to-https-post-requests#:~:text=You%20cannot%20redirect.,3. and https://softwareengineering.stackexchange.com/questions/99894/why-doesnt-http-have-post-redirect

From the above, to redirect a Post request seems like something we should avoid, so the ideal solution would be for the Geoserver's GetCapabilities to list the secured address.

I've read those 2 links and agree that redirecting POST is probably a bad idea. The summary is POST is usually not an idempotent action (ordering pizza, casting a vote, ...) so should not be repeated (a redirect is like repeating the POST).

However, I'm pretty sure we rely on Geoserver being unsecured locally so it can interact with the other services.

I assume other services can be reconfigured to use the secure httpS connection. httpS is pretty much the norm now.

Maybe new flag in getFeature() to force the use of the provided url, instead of searching for the provided one in the GetCapabilities? Not sure how good a design that would be though.

If the flag is optional, then why not? You are not introducing any breaking changes and not forcing it on users either.

f-PLT commented 3 years ago

I assume other services can be reconfigured to use the secure httpS connection. httpS is pretty much the norm now.

I agree 100%

If the flag is optional, then why not? You are not introducing any breaking changes and not forcing it on users either.

Well, short term, yes, I agree. It's also a really quick fix.

Just to be sure we're talking about the same thing, the flag would need to be in OWSLib's getFeature function

From a design perspective, I'm wary of simply adding a flag to solve this problem.

It's not a problem internal to OWSLib. The Post request, the HTTP protocol and getCapabilities (albeit with an url that is being redirected to httpS) are doing what they're suppose to do. The problem is (most probably) on our end with the way we've configured Geoserver.

Introducing a flag argument in a public module for a very specific use case like this seems, well, out of place? I wouldn't like to see, down the line, 10 flags or more for different things unrelated to the core of the WFS standard.

Just to be clear, I'm simply voicing my concerns here. At the end of the day, we do have time constraints and I'll implement what I'm told to implement.

f-PLT commented 3 years ago

I'll also be double-checking again tomorrow to make sure a standard post request works with the Resquests library, and also without the url fetching done in OWSLib.

huard commented 3 years ago

I agree with @f-PLT on not using owslib to fix issues with our own infrastructure.

Zeitsperre commented 3 years ago

@f-PLT

@huard and I had checked over your examples the other week, and everything seemed fine when using the Requests library. For the purposes of Raven, I could potentially patch in a workaround based on your code if the priority is getting Raven up and running sooner.

Maybe we can take this opportunity to address Ouranosinc/pavics-sdi#183? Since right now we are unable to test Geoserver on other PAVICS instance than Boreas meaning if we perform this upgrade, where are we going to test the upgrade?

@tlvu

I can throw some time into that issue this week if it's in the interest of the greater project development. For the HTTPS configurations, not quite my tempo. I know that if we can modernize the version, the data management side of the GeoServer would be much simpler for additional instances. This is a bit out of scope in this thread but let me know if there are other aspects of GeoServer that could use a helping hand.

huard commented 3 years ago

There is no need for a quick fix on this. Questions:

Zeitsperre commented 3 years ago

There is no need for a quick fix on this.

Good to know. If there's something else that is within Raven that needs help with, ping away. Shall I focus on https://github.com/Ouranosinc/pavics-sdi/issues/183 then ?

tlvu commented 3 years ago

Hitting directly geoserver http://pavics.ouranos.ca:8087/geoserver/wfs?request=GetCapabilities&version=1.1.0 (only works inside Ouranos firewall), I am getting proper

<ows:Operation name="GetCapabilities">
<ows:DCP>
<ows:HTTP>
<ows:Get xlink:href="http://pavics.ouranos.ca:8087/geoserver/wfs"/>
<ows:Post xlink:href="http://pavics.ouranos.ca:8087/geoserver/wfs"/>
</ows:HTTP>
</ows:DCP>

Not sure why behind Nginx and Twitcher/Magpie https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0, we end up with the wrong http protocol and the wrong port

<ows:Operation name="GetCapabilities">
<ows:DCP>
<ows:HTTP>
<ows:Get xlink:href="http://pavics.ouranos.ca:80/geoserver/wfs"/>
<ows:Post xlink:href="http://pavics.ouranos.ca:80/geoserver/wfs"/>
</ows:HTTP>
</ows:DCP>

Will attempt to put Geoserver directly behind Nginx see if the error persist. If yes then it's Twitcher/Magpie and most likely relate to https://github.com/bird-house/birdhouse-deploy/issues/14

tlvu commented 3 years ago

Digging further, geoserver is actually not behind Twitcher/Magpie, only behind Nginx. It's the same Nginx for everything else and the usual headers for performing front proxy is there. Will dig into why geoserver is miss configured.

As for why geoserver data is not protected behind Twitcher/Magpie, it is a little weird since Magpie already have the config to protect it https://github.com/bird-house/birdhouse-deploy/blob/cc4ca20665e5eeb3d3ef51e1588521f4b42bef4d/birdhouse/config/magpie/providers.cfg.template#L74-L88 This predates me starting at Ouranos so should we enable this protection or it's not protected by Magpie for a particular reason? Someone remember?

Zeitsperre commented 3 years ago

Does placing it behind Magpie complicate testing for CI ensembles/notebooks? Will users/services need authentication in order to perform WMS/WFS requests if GeoServer is behind it? I can't recall if I was ever privy to this discussion (it sounds like something my predecessor would have covered).

f-PLT commented 3 years ago

There is no need for a quick fix on this. Questions:

  • Does updating geoserver fix this ?
  • If not, where is the relevant configuration for this ?

For this issue, I don't think updating geoserver will do much. However, I did try Kartoza's latest geoserver image and the setup/configuration process has been streamlined. (env file and no need for JAVA_OPTS in the docker-compose)

If we need to add the plugin for NetCDF to Geoserver, it could be a good opportunity to update at the same time. It's not a priority right now though.

As for the configuration, I found this, which could be helpful : https://docs.geoserver.org/2.9.3/user/configuration/globalsettings.html#proxy-base-url For version 2.9.3, I think it would have to be managed by the Tomcat properties, like the the max heap size https://github.com/kartoza/docker-geoserver/tree/2.9.4#setting-tomcat-properties

f-PLT commented 3 years ago

As for why geoserver data is not protected behind Twitcher/Magpie, it is a little weird since Magpie already have the config to protect it https://github.com/bird-house/birdhouse-deploy/blob/cc4ca20665e5eeb3d3ef51e1588521f4b42bef4d/birdhouse/config/magpie/providers.cfg.template#L74-L88 This predates me starting at Ouranos so should we enable this protection or it's not protected by Magpie for a particular reason? Someone remember?

It's also before I started working for CRIM, I'll bring it up at our next standup.

tlvu commented 3 years ago

Update on this.

As for the configuration, I found this, which could be helpful : https://docs.geoserver.org/2.9.3/user/configuration/globalsettings.html#proxy-base-url

This works but it's a manual UI change. I'd rather have everything automated. This would be the fall back if nothing else work.

For version 2.9.3, I think it would have to be managed by the Tomcat properties, like the the max heap size https://github.com/kartoza/docker-geoserver/tree/2.9.4#setting-tomcat-properties

This is not working. Not sure if I implemented it wrong or our geoserver do not support it (we have a custom docker build "inspired" from kartoza build).

My attempt: https://github.com/bird-house/birdhouse-deploy/commit/fddbd1efc50c686c471546178fe32c8a5fa57aae (not found too many documentation about configuring tomcat properties, I might get this wrong).

Our custom geoserver docker build: https://github.com/bird-house/birdhouse-deploy/blob/cc4ca20665e5eeb3d3ef51e1588521f4b42bef4d/birdhouse/docker/geoserver/Dockerfile

I'll spend the rest of the day finding alternative. Else are we okay with the manual change? I can do the manual change right now on Boreas for you guys to continue testing.

tlvu commented 3 years ago

Does placing it behind Magpie complicate testing for CI ensembles/notebooks?

Should not, Thredds is being Magpie and we have e2e notebook test using data from Thredds. Only 1 notebook test Thredds with authentication all the rest are accessing Thredds publicly.

Will users/services need authentication in order to perform WMS/WFS requests if GeoServer is behind it?

I guess yes, if some path is protected. Else, like the current Thredds, everything is currently public so no auth required. In the current state (not behind Magpie) everything is public as well.

tlvu commented 3 years ago

I can do the manual change right now on Boreas for you guys to continue testing.

I just did the manual change.

https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0 returns

<ows:Operation name="GetCapabilities">
<ows:DCP>
<ows:HTTP>
<ows:Get xlink:href="https://pavics.ouranos.ca/geoserver/wfs"/>
<ows:Post xlink:href="https://pavics.ouranos.ca/geoserver/wfs"/>
</ows:HTTP>
</ows:DCP>

This is what we want?

In the mean time, I'll keep looking for an automated way to set it.

Zeitsperre commented 3 years ago

That looks right. I can try my hand at testing the POST capabilities using @f-PLT's scripts to get confirmation!

tlvu commented 3 years ago

This just pass for me

https://github.com/bird-house/flyingpigeon/blob/7b7a5ce1dda2cde93db34667368ac224a75d58cc/tests/test_subset.py#L1-L10 (branch test_post)

Zeitsperre commented 3 years ago

Seems to now be working perfectly for me as well:

url = self.urls["pavics"]
version = "1.1.0"
wfs = WebFeatureService(url, version)

result = wfs.getfeature(typename="public:CanVec_WaterBodies",
                            maxfeatures=5,
                            outputFormat="application/json",
                            method="post",
                            )

Gives me:

{"type":"FeatureCollection","totalFeatures":128796,"features":
[{"type":"Feature","id":"CanVec_WaterBodies.1","geometry":null,"properties":{"bbox":[-125.1331,64.7883,-117.4586,67.0415]}},
{"type":"Feature","id":"CanVec_WaterBodies.2","geometry":null,"properties":{"bbox":[-90.4187,46.4087,-84.6029,48.3055]}},
{"type":"Feature","id":"CanVec_WaterBodies.3","geometry":null,"properties":{"bbox":[-116.9824,60.8071,-108.9008,62.9534]}},
{"type":"Feature","id":"CanVec_WaterBodies.4","geometry":null,"properties":{"bbox":[-87.6073,41.7607,-84.7217,46.095]}},
{"type":"Feature","id":"CanVec_WaterBodies.5","geometry":null,"properties":{"bbox":[-89.58,46.4996,-84.355,49.0172]}}],
"crs":null}

Edit: Interesting that I don't receive the geometry though... Might be more to this... Edit2: WFS @v2.0.0 gives me the geometry. The fix seems to be exactly what we need!

huard commented 3 years ago

Excellent !

Trevor, we'll likely need this feature for the raven distributed model configuration we spoke about today, so this is great.

f-PLT commented 3 years ago

Edit: Interesting that I don't receive the geometry though... Might be more to this... Edit2: WFS @v2.0.0 gives me the geometry. The fix seems to be exactly what we need!

By adding propertyname=None in a wfs 1.1.0 request, you'll get it to work.

This is an unintended behavior, caused by a preset here : https://github.com/geopython/OWSLib/blob/master/owslib/feature/wfs110.py#L223

It looks voluntary, and works for a Get request, but for 2.0.0, it's defaulted to None : https://github.com/geopython/OWSLib/blob/master/owslib/feature/wfs200.py#L227

I'll see if I can fix it without breaking something else!

f-PLT commented 3 years ago

Well... turns out it looks like there's a problem with propertyname even with Get. I'll investigate and create an issue on OWSLib (or might as well just address it myself since my hands are in the code). Working as intentend, just sum small quirks between 1.1.0 and 2.0.0

I'll fix my part of the code so the default propertyname="*" in 1.1.0 does not interfere with requests.

tlvu commented 3 years ago

I got the automated way working but GetCapabilities returns https://pavics.ouranos.ca:443/geoserver/wfs (there's 443 in there).

Pretty sure it's still working, my previous https://github.com/bird-house/flyingpigeon/blob/7b7a5ce1dda2cde93db34667368ac224a75d58cc/tests/test_subset.py#L1-L10 still pass.

So just for your info if you notice something weird. It's live on Boreas right now for testing ... sorry I do not have another server with proper test data.