Open someoneLookingForHelp opened 5 months ago
Thank you for reporting. When using OGC Production environment with the REST Interface, HTTP status code 500 is returned indeed: https://cite.ogc.org/teamengine/rest/suites/wfs20/run?wfs=https%3A%2F%2Fschullandschaft.brandenburg.de%2Fedugis%2Fwfs%2Fus-govserv-education%2Fkitas%3FACCEPTVERSIONS%3D2.0.0%26request%3DGetCapabilities%26service%3DWFS%26VERSION%3D2.0.0 We will do further investigation.
I am also moving the issue to the ets-wfs20 tracker, as this behavior is most likely connected to the test suite itself.
@someoneLookingForHelp I have run the tests several times against the referenced service. The results ranged from connection timeouts and SSL handshake issues to successful test executions. Could you check your log files to see whether the service is running stably?
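For anyone reproducing this: the REST run URL above is the suite endpoint with the service's GetCapabilities URL passed percent-encoded in the `wfs` query parameter. A minimal Python sketch of how that URL can be assembled (the helper name `build_run_url` is made up for illustration and is not part of TEAM Engine):

```python
from urllib.parse import quote

# Endpoint and capabilities URL copied from the report above.
SUITE_RUN_ENDPOINT = "https://cite.ogc.org/teamengine/rest/suites/wfs20/run"
CAPABILITIES_URL = (
    "https://schullandschaft.brandenburg.de/edugis/wfs/us-govserv-education/kitas"
    "?ACCEPTVERSIONS=2.0.0&request=GetCapabilities&service=WFS&VERSION=2.0.0"
)


def build_run_url(endpoint: str, capabilities_url: str) -> str:
    """Hypothetical helper: pass the capabilities URL as the 'wfs' parameter."""
    # safe="" also encodes ':', '/', '?', '&' and '=', so the nested URL
    # cannot be mistaken for additional query parameters of the endpoint.
    return f"{endpoint}?wfs={quote(capabilities_url, safe='')}"


if __name__ == "__main__":
    print(build_run_url(SUITE_RUN_ENDPOINT, CAPABILITIES_URL))
```

If the inner URL were left unencoded, its `&request=...` and `&service=...` parts would be read as parameters of the run endpoint instead of the service.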
@bpross-52n,
today I ran the tests again and looked at the logs at the same time. The response from the testsuite is: Error executing test suite (wfs20): Error message: Failed to connect to resource located at https://schullandschaft.brandenburg.de/edugis/wfs/us-govserv-education/kitas?ACCEPTVERSIONS=2.0.0&request=GetCapabilities&service=WFS&VERSION=2.0.0
In our log files, the requests are returned with 200 (...[06/Mar/2024:09:19:57 +0100] "GET https://schullandschaft.brandenburg.de/edugis/wfs/us-govserv-education/kitas?ACCEPTVERSIONS=2.0.0&request=GetCapabilities&service=WFS&VERSION=2.0.0 HTTP/1.1" 200...).
So far I can't find any SSL handshake problems, since the requests do show up on our server. Nevertheless, we have also sent an inquiry to the service provider of the SSL proxy server, though I doubt we will get a helpful answer there.
After reading your first paragraph, might it be possible that your service is not available for a short period of time leading to a failure inside the test suite?
During my test, the server was accessible the entire time. But it is a production system where other users are also active and make requests to the server, which can lead to slight delays. Which connection timeouts do you get? How long does it take before the test suite throws an error?
I started a test run and right now the service seems to have stopped responding. Some tests did run, but it gets stuck after sending the following POST request:
<?xml version="1.0" encoding="UTF-8"?>
<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs/2.0" count="10" service="WFS" version="2.0.0">
<wfs:Query xmlns:ns60="http://inspire.ec.europa.eu/schemas/us-govserv/4.0" typeNames="ns60:GovernmentalService"/>
</wfs:GetFeature>
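To replay this request outside of TEAM Engine and Postman, something like the following Python sketch could be used. It assumes the POST goes to the service URL without the query string and that `application/xml` is an acceptable content type; neither is confirmed in this thread, and the function names are illustrative only.

```python
import time
import urllib.request

# Assumption: POST requests go to the service URL without the GET query string.
SERVICE_URL = "https://schullandschaft.brandenburg.de/edugis/wfs/us-govserv-education/kitas"

# Request body as posted above.
GET_FEATURE_BODY = """<?xml version="1.0" encoding="UTF-8"?>
<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs/2.0" count="10" service="WFS" version="2.0.0">
<wfs:Query xmlns:ns60="http://inspire.ec.europa.eu/schemas/us-govserv/4.0" typeNames="ns60:GovernmentalService"/>
</wfs:GetFeature>
"""


def build_get_feature_request(url: str, body: str) -> urllib.request.Request:
    """Build (but do not send) the POST request."""
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        # Assumption: the content type actually sent by the test suite may differ.
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout_seconds: float = 30.0):
    """Send the request and return (HTTP status, elapsed seconds).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses and
    URLError or TimeoutError when the connection fails or times out.
    """
    start = time.monotonic()
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        response.read()
        return response.status, time.monotonic() - start
```

Calling `send(build_get_feature_request(SERVICE_URL, GET_FEATURE_BODY))` in a loop and logging the status and elapsed time per attempt should show whether the service stops answering after a certain number of requests.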
I got a response:
Interesting... I get this with Postman:
The ETS test run timed out in the meantime. I also cannot access the Capabilities using a browser. Is there maybe a restriction in place or some kind of protection against DDoS attacks?
I also got a response with Postman:
Yes, there is protection against DDOS attacks. How many requests do you make with the test suite?
I counted 83 requests with two other WFS services. Depending on the available data, this number can also increase. And the requests are sent within a short period of time, so I guess the protection mechanisms are activated...
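To check whether a burst like this could trip a rate limit, one option is to count how many requests fall into any short window of the access log. A rough Python sketch, assuming the request timestamps have already been parsed into seconds; the function name, window size, and sample values are invented for illustration and say nothing about the server's actual protection rules:

```python
from collections import deque
from typing import Iterable


def max_requests_in_window(timestamps: Iterable[float], window_seconds: float) -> int:
    """Return the largest number of requests seen within any window of the given length."""
    window = deque()
    peak = 0
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop requests that are window_seconds or more older than the current one.
        while ts - window[0] >= window_seconds:
            window.popleft()
        peak = max(peak, len(window))
    return peak


if __name__ == "__main__":
    # Made-up sample: three requests in quick succession, then a few stragglers.
    sample = [0, 0.1, 0.2, 5, 5.1, 20]
    print(max_requests_in_window(sample, window_seconds=1.0))  # prints 3
```

Comparing that peak for a test suite run against the peak during normal monitoring traffic would show whether the test suite really sends requests in much tighter bursts.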
@bpross-52n: ehm, this error means your request wasn't even sent out, because "post https" is an invalid protocol. Could you check again without the "POST" in the URL field?
@dudleyperkins Oops, that is correct. Getting a timeout now as well in Postman.
Are you using a local postman from within your network or via webservice?
I'm intentionally using the postman webservice, just to eliminate any problems that could arise on the way out, and I have probably hit the button 100 times now and never got any response other than "200 OK". Times vary between 500-800ms, but that's okay considering that TCP and SSL handshakes are cached in between and only happen every third try.
Yes, I am using a local instance of postman. We run into this issue also with the Teamengine running on OGC servers. However, I can confirm that it works reliably with the postman webservice. Maybe they are whitelisted? What do you think could cause the issue?
I have now managed to reproduce this with the postman webservice as well. I am pretty sure now this is a protective measure on the server side.
The server was booted once today, you may have tested at the same time. What did your request look like? Same request as before? Can you reproduce the timeout with the postman webservice? There are no plans to restart the server today.
I made a collection of the requests that are sent by the Teamengine testsuite. Running this from the postman webservice will lead to timeouts. Do your postman webservice requests still come through? If so, I could share the collection, so you can run it from your postman webservice workspace.
We see the postman requests but no timeout. Please share your collection of requests, so I will test it from my postman webservice workspace.
Meanwhile we received feedback from our service provider: there is no whitelisting, only blacklisting, which, if active, would mean the IP is completely blocked. No request would even reach our servers. So as we still see requests coming in, even while the test is running, there must be something else going wrong.
I just think it's strange that we run our own monitoring and are also being monitored by other websites (e.g. https://directory.spatineo.com/service/156335/), reaching nearly 100% uptime for normal requests, while every request you perform out of your own network with a local postman ends up in a timeout. Meanwhile, the only thing we see on our side are "working" requests and replies with either 200 or 400. From a network/server perspective this doesn't make sense, and the problem should have its source on an upper layer.
Could you share your email or postman account name? You can use my email address, if you like: b.pross @52north.org
Thanks for the insights! I will do further investigations.
postman account is also someoneLookingForHelp
I somehow cannot share using the username, can you access this collection? https://elements.getpostman.com/redirect?entityId=1016817-c4ab14ff-64b8-4cca-a8ec-6eab6760c783&entityType=collection
I just executed a test with https://cite.ogc.org/teamengine/rest/suites/wfs20/run?wfs=https%3A%2F%2Fschullandschaft.brandenburg.de%2Fedugis%2Fwfs%2Fus-govserv-education%2Fkitas%3FACCEPTVERSIONS%3D2.0.0%26request%3DGetCapabilities%26service%3DWFS%26VERSION%3D2.0.0 The first run was successful and I retrieved a validation report.
However, when executing the test suite a second time using the same URL, the following error message was returned:
XML Parsing Error: not well-formed
Location: https://cite.ogc.org/teamengine/rest/suites/wfs20/run?wfs=https%3A%2F%2Fschullandschaft.brandenburg.de%2Fedugis%2Fwfs%2Fus-govserv-education%2Fkitas%3FACCEPTVERSIONS%3D2.0.0%26request%3DGetCapabilities%26service%3DWFS%26VERSION%3D2.0.0
Line Number 5, Column 207:
<body><p>Error executing test suite (wfs20): Error message: Failed to connect to resource located at https://schullandschaft.brandenburg.de/edugis/wfs/us-govserv-education/kitas?ACCEPTVERSIONS=2.0.0&request=GetCapabilities&service=WFS&VERSION=2.0.0</p></body>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------^
Thus, it seems that the requests of TEAM Engine were rejected by your server.
I'm at a loss.
@bpross-52n: I ran the requests of the collection. I get replies with status codes 200, 400 and 403 but no timeout. Now I've also done a performance test with postman, and here too all the requests go through. Does the test suite send multiple requests at the same time?
@dstenger: We've tried to get blocked by making requests to our server in the last few days, but that hasn't worked so far. Can you please specify which requests are blocked and when (timestamp)?
It seems that this request was not returning a response: https://schullandschaft.brandenburg.de/edugis/wfs/us-govserv-education/kitas?ACCEPTVERSIONS=2.0.0&request=GetCapabilities&service=WFS&VERSION=2.0.0 The request was sent on 21st of March at around 10:40 AM CET.
INSPIRE Reference Validator: The Test Suite was not executed because the Test Driver returned an error. Error message OGC TEAM Engine returned HTTP status code: 500 (Internal Server Error). Message: Error executing test suite (wfs20): Error message: Failed to connect to resource located at https://schullandschaft.brandenburg.de/edugis/wfs/us-govserv-education/kitas?ACCEPTVERSIONS=2.0.0&request=GetCapabilities&service=WFS&VERSION=2.0.0 Can someone help me?