Closed MichaelLangbein closed 1 year ago
Last entries of the wps-init log:
* Connected to riesgos-wps (172.28.0.3) port 8080 (#0)
* Server auth using Basic with user 'admin'
> PUT /geoserver/rest/styles/style-damagestate-sara-plasma HTTP/1.1
> Host: riesgos-wps:8080
> Authorization: Basic YWRtaW46Z2Vvc2VydmVy
> User-Agent: curl/8.1.1
> Accept: */*
> Content-type: application/vnd.ogc.sld+xml
> Content-Length: 6752
>
} [6752 bytes data]
< HTTP/1.1 200
< Date: Thu, 25 May 2023 14:07:47 GMT
< Server: Noelios-Restlet-Engine/1.0..8
< Transfer-Encoding: chunked
<
{ [5 bytes data]
100 6752 0 0 100 6752 0 757k --:--:-- --:--:-- --:--:-- 824k
* Connection #0 to host riesgos-wps left intact
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 172.28.0.3:8080...
* Connected to riesgos-wps (172.28.0.3) port 8080 (#0)
* Server auth using Basic with user 'admin'
> DELETE /geoserver/rest/styles/style-damagestate-suppasri-plasma HTTP/1.1
> Host: riesgos-wps:8080
> Authorization: Basic YWRtaW46Z2Vvc2VydmVy
> User-Agent: curl/8.1.1
> Accept: */*
> Content-type: text/xml
>
< HTTP/1.1 404
< Content-Type: text/plain
< Transfer-Encoding: chunked
< Date: Thu, 25 May 2023 14:07:47 GMT
<
{ [54 bytes data]
100 48 0 48 0 0 16460 0 --:--:-- --:--:-- --:--:-- 24000
* Connection #0 to host riesgos-wps left intact
Note: Unnecessary use of -X or --request, POST is already inferred.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 172.28.0.3:8080...
* Connected to riesgos-wps (172.28.0.3) port 8080 (#0)
* Server auth using Basic with user 'admin'
> POST /geoserver/rest/styles HTTP/1.1
> Host: riesgos-wps:8080
> Authorization: Basic YWRtaW46Z2Vvc2VydmVy
> User-Agent: curl/8.1.1
> Accept: */*
> Content-type: text/xml
> Content-Length: 119
>
} [119 bytes data]
< HTTP/1.1 201
< Date: Thu, 25 May 2023 14:07:47 GMT
< Location: http://riesgos-wps:8080/geoserver/rest/styles/style-damagestate-suppasri-plasma
< Server: Noelios-Restlet-Engine/1.0..8
< Transfer-Encoding: chunked
<
{ [5 bytes data]
100 119 0 0 100 119 0 19343 --:--:-- --:--:-- --:--:-- 19833
* Connection #0 to host riesgos-wps left intact
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 172.28.0.3:8080...
* Connected to riesgos-wps (172.28.0.3) port 8080 (#0)
* Server auth using Basic with user 'admin'
> PUT /geoserver/rest/styles/style-damagestate-suppasri-plasma HTTP/1.1
> Host: riesgos-wps:8080
> Authorization: Basic YWRtaW46Z2Vvc2VydmVy
> User-Agent: curl/8.1.1
> Accept: */*
> Content-type: application/vnd.ogc.sld+xml
> Content-Length: 6760
>
} [6760 bytes data]
< HTTP/1.1 200
< Date: Thu, 25 May 2023 14:07:47 GMT
< Server: Noelios-Restlet-Engine/1.0..8
< Transfer-Encoding: chunked
<
{ [5 bytes data]
100 6760 0 0 100 6760 0 768k --:--:-- --:--:-- --:--:-- 825k
* Connection #0 to host riesgos-wps left intact
No such style: shakemap-pgaNo such style: style-damagestate-saraNo such style: style-damagestate-suppasriNo such style: style-damagestate-medinaNo such style: style-transitionsNo such style: style-lossNo such style: style-cum-lossNo such style: style-cum-loss-chile-plasmaNo such style: style-cum-loss-peru-plasmaNo such style: style-damagestate-medina-plasmaNo such style: style-damagestate-sara-plasmaNo such style: style-damagestate-suppasri-plasma
The weird thing is that shakemap-pga.sld is actually present in /styles/. style-damagestate-suppasri also exists (under the name style_damagestate_suppasri.sld, which is written correctly in add-style-to-geoserver.sh).
Also weird is that this step happens after step_tomcat_reload_settings ... so all important tasks should already have been completed by wps-init.
Maybe we need to re-start the tomcat-servlets after boot?
The 404 for the DELETE can be explained: the script tries to delete any existing style before sending the new value to the server. If the style is not found, the server returns a 404.
The overall logic is fine with the 404, as it just means that there is nothing to delete. The script can then go on with adding the styles. You can also see the 201 response for the POST (created) and the 200 for the PUT (success).
But all those status codes come from the init script only, and they are for URLs on the geoserver.
In case we have 404 by the wps, we have to check those.
(It is also intended that the style transfer is done after the reload of the tomcat settings. I could set the styles within the file system (as I mounted it in the wps container), but then I would no longer be able to reuse the existing script to upload the files. So if I make changes there, extra handling in the buildall repo here may no longer be up to date.)
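The delete, create, upload sequence described above can be sketched roughly as follows. This is not the actual add-style-to-geoserver.sh, only an illustration of why a 404 on DELETE is harmless. The function names (`status_ok`, `http_status`, `replace_style`), the two environment variables, and the XML body of the POST are assumptions; the URLs, content types and credentials are taken from the curl log.

```shell
#!/bin/sh
# Sketch of the delete -> create -> upload flow for a single style.
# GEOSERVER_REST and GEOSERVER_AUTH are assumed defaults based on the log.
GEOSERVER_REST="${GEOSERVER_REST:-http://riesgos-wps:8080/geoserver/rest}"
GEOSERVER_AUTH="${GEOSERVER_AUTH:-admin:geoserver}"

# Decide whether an HTTP status is acceptable for a given step.
# A 404 on delete only means there was nothing to delete.
status_ok() {
    case "$1:$2" in
        delete:200|delete:404) return 0 ;;
        create:201)            return 0 ;;
        upload:200)            return 0 ;;
        *)                     return 1 ;;
    esac
}

# Run one curl call and print only the HTTP status code.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' -u "$GEOSERVER_AUTH" "$@"
}

# Hypothetical helper: replace_style <style-name> <path-to-sld-file>
replace_style() {
    name="$1"; sld_file="$2"

    code=$(http_status -X DELETE "$GEOSERVER_REST/styles/$name")
    status_ok delete "$code" || { echo "delete failed: $code" >&2; return 1; }

    # Assumed request body: registers the style name and its file name.
    code=$(http_status -X POST -H 'Content-type: text/xml' \
        -d "<style><name>$name</name><filename>$name.sld</filename></style>" \
        "$GEOSERVER_REST/styles")
    status_ok create "$code" || { echo "create failed: $code" >&2; return 1; }

    code=$(http_status -X PUT -H 'Content-type: application/vnd.ogc.sld+xml' \
        --data-binary "@$sld_file" "$GEOSERVER_REST/styles/$name")
    status_ok upload "$code" || { echo "upload failed: $code" >&2; return 1; }
}
```

A call such as `replace_style style-damagestate-suppasri-plasma ./styles/style_damagestate_suppasri.sld` (the path is hypothetical) would then stop at the first step whose status is not in the accepted set, instead of silently continuing.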
What exactly does your monitor test?
I am currently building a little script that calls the GetCapabilities request every 10 seconds, and I will see whether it throws any kind of error.
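Such a polling script could look roughly like the sketch below. It is an illustration only, not the script mentioned above. The GetCapabilities URL is the one linked from the WPS start page; the function names, the `WPS_URL` variable, the 5 second timeout and the labels printed for each status are assumptions.

```shell
#!/bin/sh
# Assumed default, overridable from the environment.
WPS_URL="${WPS_URL:-http://riesgos-wps:8080/wps/WebProcessingService}"

# Map an HTTP status code to a short label.
# curl reports 000 when no HTTP response was received at all
# (for example when the connection is refused).
classify_status() {
    case "$1" in
        200) echo "ok" ;;
        404) echo "not-found" ;;
        000) echo "unreachable" ;;
        *)   echo "error" ;;
    esac
}

# Send one GetCapabilities request and print the HTTP status code.
probe_capabilities() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
        "$WPS_URL?Request=GetCapabilities&Service=WPS"
}

# Poll forever and print one timestamped line per probe.
# Usage: poll_capabilities [interval-in-seconds], default 10.
poll_capabilities() {
    interval="${1:-10}"
    while true; do
        code=$(probe_capabilities)
        echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ') $code $(classify_status "$code")"
        sleep "$interval"
    done
}
```

Running `poll_capabilities 10` from inside one of the containers on the same network would give a timeline that shows when the WPS switches between unreachable, 404 and 200.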
What exactly does your monitor test?
It tries to execute a full run of all available scenarios once every x minutes. I've deactivated the monitor for now, though. Here's where that test happens: https://github.com/riesgos/dlr-riesgos-frontend/blob/70fcc6762416f8a0c52cadb87c60661e7051e9c4/monitor/src/runall.ts#L46
In case we have 404 by the wps, we have to check those.
Yeah, I'm getting those from the frontend when executing a wps, too.
Here's a reproduction:
Running 11/11
✔ Network buildall_default Created 0.0s
✔ Volume "buildall_tomcat-folder" Created 0.0s
✔ Volume "buildall_logs" Created 0.0s
✔ Volume "buildall_store" Created 0.0s
✔ Container buildall-backend-1 Started 0.5s
✔ Container buildall-riesgos-wps-1 Started 0.4s
✔ Container buildall-monitor-1 Started 1.1s
✔ Container buildall-frontend-1 Started 1.2s
✔ Container buildall-compare-frontend-1 Started 0.8s
✔ Container buildall-wps-init-1 Started 0.9s
✔ Container buildall-reverse_proxy-1 Started 0.9s
lang_m13-a@eoc:/localhome/lang_m13/Desktop/code/buildall$ sudo docker container exec -it buildall-backend-1 /bin/bash
root@e0a42776a762:/backend# curl http://riesgos-wps:8080
<html><body><a href="/wps-js-client">WPS JS Client</a><br><a href="/wps/WebProcessingService?Request=GetCapabilities&Service=WPS">WebProcessingService</a></body></html>
root@e0a42776a762:/backend# curl http://riesgos-wps:8080/wps
root@e0a42776a762:/backend# curl http://riesgos-wps:8080/wps/WebProcessingService
<!doctype html><html lang="en"><head><title>HTTP Status 404 – Not Found</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 404 – Not Found</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Description</b> The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.</p><hr class="line" /><h3>Apache Tomcat/9.0.65</h3></body></html>
root@e0a42776a762:/backend#
To reproduce:
docker compose -f docker-compose.yml -f ./docker-compose.dlr.yml --env-file .env down -v
Open http://<your-ip>:8000 to see the frontend (in an anonymous tab to make sure no browser cache is being applied).
The process will fail on the first attempt with "connect ECONNREFUSED 127.0.0.1:8080" and on all subsequent attempts (even after wps is fully booted up) with "404".
Note:
docker container run --rm -it -v=buildall_logs:/mounts/logs -v buildall_store:/mounts/store busybox tree /mounts
Riesgos-wps outputs the following error-message:
30-May-2023 13:28:01.257 INFO [GuavaAuthCache-0-1] org.apache.catalina.loader.WebappClassLoaderBase.checkStateForResourceLoading Illegal access: this web application instance has been stopped already. Could not load [com.google.common.cache.RemovalCause]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
java.lang.IllegalStateException: Illegal access: this web application instance has been stopped already. Could not load [com.google.common.cache.RemovalCause]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForResourceLoading(WebappClassLoaderBase.java:1432)
at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForClassLoading(WebappClassLoaderBase.java:1420)
at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1259)
at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1220)
at com.google.common.cache.LocalCache$Segment.expireEntries(LocalCache.java:2622)
at com.google.common.cache.LocalCache$Segment.runLockedCleanup(LocalCache.java:3446)
at com.google.common.cache.LocalCache$Segment.cleanUp(LocalCache.java:3438)
at com.google.common.cache.LocalCache.cleanUp(LocalCache.java:3858)
at com.google.common.cache.LocalCache$LocalManualCache.cleanUp(LocalCache.java:4797)
at org.geoserver.security.auth.GuavaAuthenticationCacheImpl$1.run(GuavaAuthenticationCacheImpl.java:65)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Seems like we need to make sure riesgos-wps is completely booted before sending requests to it. A few ways of doing this:
frontend: depends_on:
GetCapabilities before allowing posts
wait-for-it script: https://git.gfz-potsdam.de/api/v4/projects/1796/repository/files/wait-for-it.sh/raw?ref=81b1373f17855a4dc21156cfe1694c31d7d1792e
Ok, the latest version of the backend now has a mechanism where a user can define tests that must pass before the API is available. That should solve this issue. Also, since the build script currently fetches the dev branch of the backend automatically, this should apply immediately after a rebuild.
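The general idea of gating requests on a readiness check can be sketched in shell as below. This is neither the wait-for-it script linked above nor the backend's own mechanism, just a minimal retry loop under stated assumptions: `wait_until` and `wps_ready` are hypothetical names, and the probe treats a successful GetCapabilities response as "ready".

```shell
#!/bin/sh
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
    max_attempts="$1"; delay="$2"; shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Hypothetical probe: succeeds only if GetCapabilities answers without an
# HTTP error (curl -f turns responses with status >= 400 into a failure).
wps_ready() {
    curl -sf -o /dev/null --max-time 5 \
        "http://riesgos-wps:8080/wps/WebProcessingService?Request=GetCapabilities&Service=WPS"
}
```

An entrypoint could then run something like `wait_until 60 5 wps_ready && exec "$@"`, so that the dependent service only starts once the WPS has answered. Checking the WPS endpoint itself, rather than just the open port, matters here because Tomcat already accepts connections while the WPS web application still returns 404.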
Still trying to reproduce when exactly this happens.
What I've observed so far:
This might have a lot to do with my development-machine, though ... still investigating.
UPDATE: I've deactivated the monitor for now. I found that no errors get thrown anymore ~10 minutes after the containers are up, but this only seems to work after having re-started the wps once.