DescartesResearch / TeaStore

A micro-service reference test application for model extraction, cloud management, energy efficiency, power prediction, single- and multi-tier auto-scaling
https://se.informatik.uni-wuerzburg.de
Apache License 2.0
118 stars 136 forks source link

Image service REST state/finished endpoints return inconsistent state #230

Closed SoerenHenning closed 1 year ago

SoerenHenning commented 1 year ago

This issue might be related to my other issue (#220), so maybe getting the logs will provide further details.

I am deploying the TeaStore Image service with multiple replicas in Kubernetes with the Cluster IP setup. Additionally, I am using Open Service Mesh (as in #220) so I cannot rule out that OSM is somehow related to this issue. Additionally, I added a Kubernetes readinessProbe to the Image service deployment as done in your ICPE paper.

Irregularly, I experience some Pods not going in a Ready state due to the failing readinessProbe. execing into the Pods, I can confirm the finished immediately returns false (no timeout or so):

$ curl localhost:8080/tools.descartes.teastore.image/rest/image/finished
false

However, when calling the state endpoint, I get the following response, saying that creation should be finished:

$ curl localhost:8080/tools.descartes.teastore.image/rest/image/state
Image Provider State:
---------------------
Working Directory: /usr/local/tomcat/images
Storage Mode: Drive
Storage Rule: All
Caching Mode: LFU
Caching Rule: All
Creator Thread: Finished
Images Created: 0 / 0
Pre-Existing Images Found: 11
Category Images Found: 9

After waiting some more time (~20 minutes), some but not all instances get into a ready state. What I am now wondering is whether this behavior is expected and if I really have to wait for more than 20 minutes to all instances becoming ready or if there is something strange going on.

I observed similar effects with the Persistence and with the Recommender service, where the finished/isready endpoints return false, although other endpoints are working as expected. I haven't looked into details here, but can add more information on these services later.

Just as a site note: the underlying Kubernetes cluster should be sufficiently dimensioned as CPU and memory utilization is below 10%.

SimonEismann commented 1 year ago

Hi Sören, we've hopefully resolved the logging issues from #220 which should help us get some more information on what's going on here :) Could you test this again with the (hopefully) working logs and see if there is anything useful in the logs?

SimonEismann commented 1 year ago

Totally forgot to mention, that what you are experiencing is not the expected behavior. The image provider instances should become ready quite quickly.

SoerenHenning commented 1 year ago

The logs definitely helped and I think I'm a few steps further now. The Image and Recommender service seems to work fine. My original issue was (probably) caused by a combination of exponential retries (up to 2 minutes) when connecting to the Persistence service and some Persistence service pods not being ready. That means, I think these Persistence pods are the real problem, which I'm now going to explain further.

I'm starting 4 instances of the Persistence service, 1 instance of the database and the registry and 15 instances of all the other 4 services. The Persistence readiness probe is configured to call /tools.descartes.teastore.persistence/rest/generatedb/finished. I observed that only 1 of the 4 pods is getting in a ready state, which is the one which was generating the database content:

$ kubectl get pods -l run=teastore-persistence
NAME                                    READY   STATUS    RESTARTS   AGE
teastore-persistence-6b6cc485b6-28v8l   1/2     Running   0          5m9s   <-- that pod found a populated database 
teastore-persistence-6b6cc485b6-f648g   1/2     Running   0          5m1s   <-- that pod found a populated database 
teastore-persistence-6b6cc485b6-gshx2   1/2     Running   0          4m26s  <-- that pod found a populated database 
teastore-persistence-6b6cc485b6-pvd87   2/2     Running   0          5m14s  <-- that pod was generating the database content

I attached the logs for two of these pods at the end of this comment. When exec-ing into the non-ready pod, I can see that the generatedb/finished endpoint return false, but the connection to the database (calling categories/5) seems to work:

$ kubectl exec -it teastore-persistence-6b6cc485b6-28v8l bash
root@teastore-persistence-6b6cc485b6-28v8l:/usr/local/tomcat# curl localhost:8080/tools.descartes.teastore.persistence/rest/generatedb/finished
false
root@teastore-persistence-6b6cc485b6-28v8l:/usr/local/tomcat# curl localhost:8080/tools.descartes.teastore.persistence/rest/categories/5
{"id":5,"name":"Rooibos","description":"In many variations"}

Logs of the pod that generated the database:

$ kubectl logs -f teastore-persistence-6b6cc485b6-pvd87 teastore-persistence 
Setting heap space to 2048000 KiB
17-Aug-2022 10:20:50.956 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version name:   Apache Tomcat/10.0.7
17-Aug-2022 10:20:50.959 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server built:          Jun 8 2021 16:31:06 UTC
17-Aug-2022 10:20:50.959 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version number: 10.0.7.0
17-Aug-2022 10:20:50.959 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Name:               Linux
17-Aug-2022 10:20:50.959 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Version:            4.19.0-18-amd64
17-Aug-2022 10:20:50.960 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Architecture:          amd64
17-Aug-2022 10:20:50.960 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Java Home:             /usr/local/openjdk-11
17-Aug-2022 10:20:50.960 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Version:           11.0.11+9
17-Aug-2022 10:20:50.960 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Vendor:            Oracle Corporation
17-Aug-2022 10:20:50.960 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_BASE:         /usr/local/tomcat
17-Aug-2022 10:20:50.960 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_HOME:         /usr/local/tomcat
17-Aug-2022 10:20:50.970 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
17-Aug-2022 10:20:50.970 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
17-Aug-2022 10:20:50.970 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -javaagent:/kieker/agent/agent.jar
17-Aug-2022 10:20:50.970 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: --add-opens=java.base/java.lang=ALL-UNNAMED
17-Aug-2022 10:20:50.970 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dkieker.monitoring.configuration=/kieker/config/kieker.monitoring.properties
17-Aug-2022 10:20:50.970 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Daj.weaving.verbose=false
17-Aug-2022 10:20:50.970 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.aspectj.weaver.loadtime.configuration=aop.xml
17-Aug-2022 10:20:50.970 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dkieker.monitoring.skipDefaultAOPConfiguration=true
17-Aug-2022 10:20:50.971 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Daj.weaving.loadersToSkip=java.net.URLClassLoader
17-Aug-2022 10:20:50.971 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcom.sun.jndi.ldap.object.disableEndpointIdentification=true
17-Aug-2022 10:20:50.971 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
17-Aug-2022 10:20:50.971 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.protocol.handler.pkgs=org.apache.catalina.webresources
17-Aug-2022 10:20:50.971 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.apache.catalina.security.SecurityListener.UMASK=0027
17-Aug-2022 10:20:50.971 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dignore.endorsed.dirs=
17-Aug-2022 10:20:50.971 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.base=/usr/local/tomcat
17-Aug-2022 10:20:50.971 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.home=/usr/local/tomcat
17-Aug-2022 10:20:50.971 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.io.tmpdir=/usr/local/tomcat/temp
17-Aug-2022 10:20:50.975 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent Loaded Apache Tomcat Native library [1.2.30] using APR version [1.6.5].
17-Aug-2022 10:20:50.975 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true], UDS [true].
17-Aug-2022 10:20:50.977 INFO [main] org.apache.catalina.core.AprLifecycleListener.initializeSSL OpenSSL successfully initialized [OpenSSL 1.1.1d  10 Sep 2019]
17-Aug-2022 10:20:51.238 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
17-Aug-2022 10:20:51.258 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-127.0.0.1-8009"]
17-Aug-2022 10:20:51.259 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [461] milliseconds
17-Aug-2022 10:20:51.296 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
17-Aug-2022 10:20:51.296 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/10.0.7]
17-Aug-2022 10:20:51.308 INFO [main] org.apache.catalina.startup.HostConfig.deployWAR Deploying web application archive [/usr/local/tomcat/webapps/tools.descartes.teastore.persistence.war]
17-Aug-2022 10:20:52.328 INFO [main] org.apache.jasper.servlet.TldScanner.scanJars At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
[main] INFO tools.descartes.teastore.persistence.repository.EMFManager - Using MySQL/MariaDB database.
[main] INFO tools.descartes.teastore.persistence.repository.EMFManager - Setting jdbc url to "jdbc:mysql://teastore-db:3306/teadb".
TEST
[main] WARN tools.descartes.teastore.persistence.daemons.InitialDataGenerationDaemon - Exception connecting to database. Is database offline? Wating for 2000 ms.
[main] INFO tools.descartes.teastore.persistence.daemons.InitialDataGenerationDaemon - Database is empty. Generating new database content
[main] INFO kieker.monitoring.core.configuration.ConfigurationFactory - Loading configuration from JVM-specified location: '/kieker/config/kieker.monitoring.properties'
[main] INFO kieker.monitoring.core.controller.TCPController - Could not parse port for the TCPController, deactivating this option. Received string was: 
WriterController: WriterClassName: kieker.monitoring.writer.filesystem.FileWriter
[main] INFO kieker.monitoring.core.controller.MonitoringController - Current State of kieker.monitoring (1.15-SNAPSHOT) Status: 'disabled'
        Name: 'KIEKER'; Hostname: 'teastore-persistence-6b6cc485b6-pvd87'; experimentID: '1'
JMXController: JMX disabled
TimeSource: 'kieker.monitoring.timer.SystemNanoTimer'
        Time in nanoseconds (with nanoseconds precision) since Thu Jan 01 00:00:00 UTC 1970'
ProbeController: disabled
WriterController:
        Queue type: class kieker.monitoring.queue.BlockingQueueDecorator
        Queue capacity: 10000
        Insert behavior (a.k.a. QueueFullBehavior): class kieker.monitoring.queue.behavior.BlockOnFailedInsertBehavior
                numBlocked: 0
Writer: 'kieker.monitoring.writer.filesystem.FileWriter'
        Configuration:
                kieker.monitoring.writer.filesystem.FileWriter.logFilePoolHandler='kieker.monitoring.writer.filesystem.RotatingLogFilePoolHandler'
                kieker.monitoring.writer.filesystem.FileWriter.PrioritizedQueueSize='100'
                kieker.monitoring.writer.filesystem.FileWriter.charsetName='UTF-8'
                kieker.monitoring.writer.filesystem.FileWriter.logStreamHandler='kieker.monitoring.writer.filesystem.TextLogStreamHandler'
                kieker.monitoring.writer.filesystem.FileWriter.bufferSize='8192'
                kieker.monitoring.writer.filesystem.FileWriter.maxEntriesInFile='25000'
                kieker.monitoring.writer.filesystem.FileWriter.maxLogFiles='-1'
                kieker.monitoring.writer.filesystem.FileWriter.MaxShutdownDelay='-1'
                kieker.monitoring.writer.filesystem.FileWriter.maxLogSize='-1'
                kieker.monitoring.writer.filesystem.FileWriter.mapFileHandler='kieker.monitoring.writer.filesystem.TextMapFileHandler'
                kieker.monitoring.writer.filesystem.FileWriter.flush='true'
                kieker.monitoring.writer.filesystem.FileWriter.QueueFullBehavior='0'
                kieker.monitoring.writer.filesystem.FileWriter.QueueSize='10000'
                kieker.monitoring.writer.filesystem.FileWriter.customStoragePath='/kieker/logs/'
                kieker.monitoring.writer.filesystem.FileWriter.actualStoragePath='/kieker/logs/kieker-20220817-102126-17726501559145506-UTC--KIEKER'

        Automatic assignment of logging timestamps: 'true'
Sampling Controller: Periodic Sensor available: Poolsize: '0'; Scheduled Tasks: '0'
[main] INFO kieker.monitoring.core.registry.ControlFlowRegistry - First threadId will be 4039588128262979584
[main] INFO tools.descartes.teastore.persistence.daemons.InitialDataGenerationDaemon - Persistence finished initializing database
[main] INFO tools.descartes.teastore.persistence.daemons.InitialDataGenerationDaemon - Persistence started registration daemon
17-Aug-2022 10:22:16.233 WARNING [main] org.glassfish.jersey.server.wadl.WadlFeature.configure JAX-B API not found . WADL feature is disabled.
17-Aug-2022 10:22:16.726 INFO [main] org.apache.catalina.startup.HostConfig.deployWAR Deployment of web application archive [/usr/local/tomcat/webapps/tools.descartes.teastore.persistence.war] has finished in [85,417] ms
17-Aug-2022 10:22:16.729 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
17-Aug-2022 10:22:16.740 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-127.0.0.1-8009"]
17-Aug-2022 10:22:16.743 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [85484] milliseconds

Logs of one of the pods that used the populated database:

$ kubectl logs -f teastore-persistence-6b6cc485b6-28v8l -c teastore-persistence 
Setting heap space to 2048000 KiB
17-Aug-2022 10:20:56.704 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version name:   Apache Tomcat/10.0.7
17-Aug-2022 10:20:56.707 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server built:          Jun 8 2021 16:31:06 UTC
17-Aug-2022 10:20:56.707 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version number: 10.0.7.0
17-Aug-2022 10:20:56.707 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Name:               Linux
17-Aug-2022 10:20:56.707 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Version:            4.19.0-18-amd64
17-Aug-2022 10:20:56.707 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Architecture:          amd64
17-Aug-2022 10:20:56.707 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Java Home:             /usr/local/openjdk-11
17-Aug-2022 10:20:56.708 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Version:           11.0.11+9
17-Aug-2022 10:20:56.708 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Vendor:            Oracle Corporation
17-Aug-2022 10:20:56.708 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_BASE:         /usr/local/tomcat
17-Aug-2022 10:20:56.708 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_HOME:         /usr/local/tomcat
17-Aug-2022 10:20:56.717 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
17-Aug-2022 10:20:56.717 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
17-Aug-2022 10:20:56.717 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -javaagent:/kieker/agent/agent.jar
17-Aug-2022 10:20:56.717 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: --add-opens=java.base/java.lang=ALL-UNNAMED
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dkieker.monitoring.configuration=/kieker/config/kieker.monitoring.properties
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Daj.weaving.verbose=false
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.aspectj.weaver.loadtime.configuration=aop.xml
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dkieker.monitoring.skipDefaultAOPConfiguration=true
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Daj.weaving.loadersToSkip=java.net.URLClassLoader
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcom.sun.jndi.ldap.object.disableEndpointIdentification=true
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.protocol.handler.pkgs=org.apache.catalina.webresources
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.apache.catalina.security.SecurityListener.UMASK=0027
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dignore.endorsed.dirs=
17-Aug-2022 10:20:56.718 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.base=/usr/local/tomcat
17-Aug-2022 10:20:56.719 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.home=/usr/local/tomcat
17-Aug-2022 10:20:56.719 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.io.tmpdir=/usr/local/tomcat/temp
17-Aug-2022 10:20:56.722 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent Loaded Apache Tomcat Native library [1.2.30] using APR version [1.6.5].
17-Aug-2022 10:20:56.722 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true], UDS [true].
17-Aug-2022 10:20:56.724 INFO [main] org.apache.catalina.core.AprLifecycleListener.initializeSSL OpenSSL successfully initialized [OpenSSL 1.1.1d  10 Sep 2019]
17-Aug-2022 10:20:56.970 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
17-Aug-2022 10:20:56.988 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-127.0.0.1-8009"]
17-Aug-2022 10:20:56.989 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [446] milliseconds
17-Aug-2022 10:20:57.027 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
17-Aug-2022 10:20:57.027 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/10.0.7]
17-Aug-2022 10:20:57.038 INFO [main] org.apache.catalina.startup.HostConfig.deployWAR Deploying web application archive [/usr/local/tomcat/webapps/tools.descartes.teastore.persistence.war]
17-Aug-2022 10:20:58.142 INFO [main] org.apache.jasper.servlet.TldScanner.scanJars At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
[main] INFO tools.descartes.teastore.persistence.repository.EMFManager - Using MySQL/MariaDB database.
[main] INFO tools.descartes.teastore.persistence.repository.EMFManager - Setting jdbc url to "jdbc:mysql://teastore-db:3306/teadb".
[main] WARN tools.descartes.teastore.persistence.daemons.InitialDataGenerationDaemon - Exception connecting to database. Is database offline? Wating for 2000 ms.
TEST
[main] INFO tools.descartes.teastore.persistence.daemons.InitialDataGenerationDaemon - Populated database found. Skipping data generation
[main] INFO tools.descartes.teastore.persistence.daemons.InitialDataGenerationDaemon - Persistence finished initializing database
[main] INFO tools.descartes.teastore.persistence.daemons.InitialDataGenerationDaemon - Persistence started registration daemon
[main] INFO kieker.monitoring.core.configuration.ConfigurationFactory - Loading configuration from JVM-specified location: '/kieker/config/kieker.monitoring.properties'
[main] INFO kieker.monitoring.core.controller.TCPController - Could not parse port for the TCPController, deactivating this option. Received string was: 
WriterController: WriterClassName: kieker.monitoring.writer.filesystem.FileWriter
[main] INFO kieker.monitoring.core.controller.MonitoringController - Current State of kieker.monitoring (1.15-SNAPSHOT) Status: 'disabled'
        Name: 'KIEKER'; Hostname: 'teastore-persistence-6b6cc485b6-28v8l'; experimentID: '1'
JMXController: JMX disabled
TimeSource: 'kieker.monitoring.timer.SystemNanoTimer'
        Time in nanoseconds (with nanoseconds precision) since Thu Jan 01 00:00:00 UTC 1970'
ProbeController: disabled
WriterController:
        Queue type: class kieker.monitoring.queue.BlockingQueueDecorator
        Queue capacity: 10000
        Insert behavior (a.k.a. QueueFullBehavior): class kieker.monitoring.queue.behavior.BlockOnFailedInsertBehavior
                numBlocked: 0
Writer: 'kieker.monitoring.writer.filesystem.FileWriter'
        Configuration:
                kieker.monitoring.writer.filesystem.FileWriter.logFilePoolHandler='kieker.monitoring.writer.filesystem.RotatingLogFilePoolHandler'
                kieker.monitoring.writer.filesystem.FileWriter.PrioritizedQueueSize='100'
                kieker.monitoring.writer.filesystem.FileWriter.charsetName='UTF-8'
                kieker.monitoring.writer.filesystem.FileWriter.logStreamHandler='kieker.monitoring.writer.filesystem.TextLogStreamHandler'
                kieker.monitoring.writer.filesystem.FileWriter.bufferSize='8192'
                kieker.monitoring.writer.filesystem.FileWriter.maxEntriesInFile='25000'
                kieker.monitoring.writer.filesystem.FileWriter.maxLogFiles='-1'
                kieker.monitoring.writer.filesystem.FileWriter.MaxShutdownDelay='-1'
                kieker.monitoring.writer.filesystem.FileWriter.maxLogSize='-1'
                kieker.monitoring.writer.filesystem.FileWriter.mapFileHandler='kieker.monitoring.writer.filesystem.TextMapFileHandler'
                kieker.monitoring.writer.filesystem.FileWriter.flush='true'
                kieker.monitoring.writer.filesystem.FileWriter.QueueFullBehavior='0'
                kieker.monitoring.writer.filesystem.FileWriter.QueueSize='10000'
                kieker.monitoring.writer.filesystem.FileWriter.customStoragePath='/kieker/logs/'
                kieker.monitoring.writer.filesystem.FileWriter.actualStoragePath='/kieker/logs/kieker-20220817-102131-23261810674870043-UTC--KIEKER'

        Automatic assignment of logging timestamps: 'true'
Sampling Controller: Periodic Sensor available: Poolsize: '0'; Scheduled Tasks: '0'
[main] INFO kieker.monitoring.core.registry.ControlFlowRegistry - First threadId will be 5951084695105044480
17-Aug-2022 10:21:32.149 WARNING [main] org.glassfish.jersey.server.wadl.WadlFeature.configure JAX-B API not found . WADL feature is disabled.
17-Aug-2022 10:21:32.532 INFO [main] org.apache.catalina.startup.HostConfig.deployWAR Deployment of web application archive [/usr/local/tomcat/webapps/tools.descartes.teastore.persistence.war] has finished in [35,494] ms
17-Aug-2022 10:21:32.535 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
17-Aug-2022 10:21:32.546 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-127.0.0.1-8009"]
17-Aug-2022 10:21:32.551 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [35562] milliseconds
SimonEismann commented 1 year ago

I've tried to replicate this issue locally by running a docker compose file with two persistence containers and here the issue unfortunately does not show up as both containers report as ready :/

Below is the code responsible for database readiness:

    public boolean getGenerationFinishedFlag() {
        if (isMaintenanceMode()) {
            return false;
        }
        boolean finishedGenerating = false;
        EntityManager em = CategoryRepository.REPOSITORY.getEM();
        try {
            List<DatabaseManagementEntity> entities =
                    em.createQuery("SELECT u FROM "
                            + DatabaseManagementEntity.class.getName()
                            + " u", DatabaseManagementEntity.class)
                    .getResultList();
            if (entities != null && !entities.isEmpty()) {
                finishedGenerating = entities.get(0).isFinishedGenerating();
            }
        } finally {
            em.close();
        }
        return finishedGenerating;
    }

Based on this, I can see two possible reasons for the behavior you are experiencing:

If it is not the maintenance mode, I could push a version out to develop that includes detailed logging in the getGenerationFinishedFlag() method, since I can not reproduce the issue myself.

SimonEismann commented 1 year ago

Forgot to mention, we probably exclude anything about the database state being the issue since this would mean that all pods do not report readiness

SoerenHenning commented 1 year ago

Sorry, closed by mistake.

SimonEismann commented 1 year ago

Btw if you want you can also reach out via email and we can schedule a quick call to take a look at this together since I can't reproduce the issue locally.

SimonEismann commented 1 year ago

Tracked this down to be an issue with the cache coherence in the Kubernetes deployment, resolved by disabling the cache for the readiness state