lancechant opened this issue 3 months ago
Hi @lancechant,
Thank you for reaching out.
Please provide the information below to help with our investigation:
Hi @leochr, here are the responses:
@lancechant Thank you for the additional details.
Could you please provide the server log? We are interested in seeing the events leading up to the startup of the Liberty server and the application, including the various endpoints. You can trim out the rest.
Are you defining custom health checks by implementing the org.eclipse.microprofile.health.HealthCheck interface from MicroProfile Health? If so, could you please provide details of those (Readiness, Liveness, Startup)? Are they part of the same application?
Hi @leochr, attached are the logs from the Docker container. I tried to leave as much there as possible and only removed some information; I hope it helps.
Yes, we are. It's a dependency from another lib we have that is brought in with Maven. The health check is annotated with the Liveness and Readiness annotations, and it just checks, for example, that the connections to the database and MQ are live and working well.
Hey @leochr just checking if you've made any headway with this?
@lancechant Sorry for the delayed response. I missed the notification for your inquiry.
I investigated the logs but I couldn't find anything that could explain the behaviour.
Please share more details about the health check. Thank you.
Is the health check the Docker container health check, i.e. is it configured through the docker run command used with --health-cmd? Could you please provide the full command (in particular, the command options for starting the container and the health check)? --health-start-period allows the initial health check to be delayed so that health checks are not performed prematurely.
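For reference, a docker run invocation using those options might look roughly like the following; the image name, health endpoint, port, and timings here are illustrative assumptions rather than values taken from this thread:

docker run -d --name test-app \
  --health-cmd "curl -f http://localhost:8080/health/ready || exit 1" \
  --health-interval 10s \
  --health-retries 3 \
  --health-start-period 120s \
  my-registry/test-app:latest

This assumes curl is available inside the image; the key point is that failures during the --health-start-period window do not count against the container's retry limit, so a slow-starting application is not marked unhealthy prematurely.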
06:22:50.845 [Default Executor-thread-17] DEBUG z.c.s.t.a.h.b.JmsBrokerHealthCheck - Scanning for supported message brokers...
06:22:50.847 [Default Executor-thread-17] DEBUG z.c.s.t.a.h.b.JmsBrokerHealthCheck - Found the following JMS broker endpoints: [active-mq-broker]
06:22:50.854 [Default Executor-thread-37] DEBUG z.c.s.t.a.h.b.DefaultHealthCheckSwitch - Using the default enabler
06:22:50.854 [Default Executor-thread-2] DEBUG z.c.s.t.a.h.b.DefaultHealthCheckSwitch - Using the default enabler
06:22:50.854 [Default Executor-thread-17] DEBUG z.c.s.t.a.h.b.DefaultHealthCheckSwitch - Using the default enabler
06:22:50.854 [Default Executor-thread-39] DEBUG z.c.s.t.a.h.b.DefaultHealthCheckSwitch - Using the default enabler
06:22:50.945 [Default Executor-thread-39] DEBUG z.c.s.t.p.p.j.c.DBController - Executing a JPortal function within za.co.sbg.App.message.obc.entity.persistence.ObcToggles
06:22:50.945 [Default Executor-thread-17] DEBUG z.c.s.t.p.p.j.c.DBController - Executing a JPortal function within za.co.sbg.App.message.obc.entity.persistence.ObcToggles
06:22:50.949 [Default Executor-thread-17] DEBUG z.c.s.t.a.h.b.JmsBrokerHealthCheck - All the health check switches are in the on position
06:22:50.949 [Default Executor-thread-39] DEBUG z.c.s.t.a.h.b.JmsBrokerHealthCheck - All the health check switches are in the on position
Hi @leochr, thanks for getting back; no worries, I'm sure you have many things to work on.
@Slf4j
@Liveness
@Readiness
@ApplicationScoped
public class JmsBrokerHealthCheck implements HealthCheck {

    @Inject
    private Configuration platformConfiguration;

    @Inject
    private DynamicResourceProvider dynamicResourceProvider;

    @Inject
    @JmsBrokerSwitches
    private Instance<HealthCheckSwitch> jmsBrokerHealthCheckSwitchInstances; // There should always be at least one of these.

    private Set<String> brokerJndiSet;

    @PostConstruct
    public void initialize() {
        log.debug("Scanning for supported message brokers...");
        brokerJndiSet = platformConfiguration.getAllMessageRoutes().stream()
                .filter(r -> r.getMessageTransport() == null || JMS.equalsIgnoreCase(r.getMessageTransport()))
                .map(Endpoint.Route::getEndpointName)
                .collect(Collectors.toSet());
        log.debug("Found the following JMS broker endpoints: {}", brokerJndiSet);
    }

    @Override
    public HealthCheckResponse call() {
        if (shouldCheck()) {
            return checkBrokers();
        } else {
            return HealthCheckResponse.named(getClass().getSimpleName())
                    .withData("Broker Check Off", "JMS health checks are disabled, Status forced to UP")
                    .up().build();
        }
    }

    private boolean shouldCheck() {
        Optional<HealthCheckSwitch> firstOffSwitch = jmsBrokerHealthCheckSwitchInstances.stream()
                .filter(healthCheckSwitch -> !healthCheckSwitch.shouldCheck()) // Essentially ANDing them all together
                .findFirst();
        if (firstOffSwitch.isEmpty()) {
            log.debug("All the health check switches are in the on position");
            return true;
        } else {
            log.debug("At least one health check switch is in the off position");
            return false;
        }
    }

    private HealthCheckResponse checkBrokers() {
        HealthCheckResponseBuilder healthCheckResponseBuilder = HealthCheckResponse.named(getClass().getSimpleName());
        boolean allBrokersAreValid = true;
        for (String jmsJndi : brokerJndiSet) {
            try {
                ConnectionFactory connectionFactory =
                        dynamicResourceProvider.lookup(JmsMessageProducer.CONNECTION_FACTORY_PREFIX + jmsJndi);
                try (var conn = connectionFactory.createConnection()) {
                    healthCheckResponseBuilder = healthCheckResponseBuilder
                            .withData(jmsJndi, "Successfully connected to broker");
                }
            } catch (Exception e) {
                allBrokersAreValid = false;
                healthCheckResponseBuilder = healthCheckResponseBuilder
                        .withData(jmsJndi, "Connection to broker was unsuccessful: " + e.getMessage());
                log.error("Error connecting to broker {}", jmsJndi, e);
            }
        }
        if (allBrokersAreValid) {
            return healthCheckResponseBuilder.up().build();
        }
        return healthCheckResponseBuilder.down().build();
    }
}
Hey @leochr a small observation I noticed, and not to sure how the OL docker image works.. but I say it showed the defaultServer
started, but then a few lines later says the actual application we copied over hasn't started yet, this is when the health check of just up
shows, so is there something we missing in the docker container creation? below is the example docker file we have
FROM open-liberty:24.0.0.2-kernel-slim-java17-openj9
ENV KEYSTORE_REQUIRED=false VERBOSE=true
EXPOSE 8080 8543
COPY --chown=1001 dependencies /opt/ol/wlp/usr/
COPY --chown=1001 maven /config/
COPY --chown=1001 keystore /var/keystore/
RUN features.sh
RUN configure.sh
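The Dockerfile above doesn't define a HEALTHCHECK, so the container health check presumably comes from the run command or from the orchestrator. If it were baked into the image instead, a start period could be set there as well. A minimal sketch, assuming curl is available in the image and that the mpHealth endpoint is served on port 8080 (per the EXPOSE line above):

HEALTHCHECK --start-period=120s --interval=10s --timeout=5s --retries=3 \
  CMD curl -f http://localhost:8080/health/ready || exit 1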
Hi
We have a server on an instance with limited resources, so it takes a while to start up. We have noticed that we get health checks that "pass" and say the server is up, when in fact our application is not up yet; it only comes up later.
The docker image passes health checks here:
[AUDIT ] CWWKF0012I: The server installed the following features: [appSecurity-5.0, beanValidation-3.0, cdi-4.0, connectors-2.1, distributedMap-1.0, enterpriseBeansLite-4.0, enterpriseBeansPersistentTimer-4.0, expressionLanguage-5.0, jdbc-4.3, jndi-1.0, json-1.0, jsonb-3.0, jsonp-2.1, jwt-1.0, localConnector-1.0, mdb-4.0, messaging-3.1, microProfile-6.1, monitor-1.0, mpConfig-3.1, mpFaultTolerance-4.0, mpHealth-4.0, mpJwt-2.1, mpMetrics-5.1, mpOpenAPI-3.1, mpRestClient-3.0, mpTelemetry-1.1, restfulWS-3.1, restfulWSClient-3.1, servlet-6.0, ssl-1.0, transportSecurity-1.0].
[AUDIT ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 43.494 seconds
But a few lines below that we have a log line showing the application has not started yet:
[AUDIT ] CWWKZ0022W: Application test-app has not started in 30.000 seconds.
At this point the docker image passes health checks while the server is still starting up. We also have custom health checks in the server, but those are not triggered at this point, as the only response we get is status UP; it then goes to a DOWN status before coming back with our custom health check responses.
Is this intended behaviour, and what can we do to get around it?
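One avenue that may be worth looking into here (please verify against the mpHealth-4.0 documentation for the Liberty version in use): MicroProfile Health exposes config properties, mp.health.default.readiness.empty.response and mp.health.default.startup.empty.response, that change the default response from UP to DOWN while no application checks have been registered yet, which sounds like the window described above. Pairing that with a dedicated startup check can also make the "application not started yet" state explicit. A minimal sketch of such a check; the class name and the initialised flag are illustrative assumptions, not part of the application discussed in this thread:

import jakarta.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.health.HealthCheck;
import org.eclipse.microprofile.health.HealthCheckResponse;
import org.eclipse.microprofile.health.Startup;

@Startup
@ApplicationScoped
public class ApplicationStartupCheck implements HealthCheck {

    // Flipped to true once the application's own initialisation has finished
    // (illustrative; wire this to whatever signals readiness in the real app).
    private volatile boolean initialised = false;

    public void markInitialised() {
        initialised = true;
    }

    @Override
    public HealthCheckResponse call() {
        // Reports DOWN until the application marks itself as initialised.
        return HealthCheckResponse.named("application-startup")
                .status(initialised)
                .build();
    }
}

With a check like this registered under @Startup, the /health/started endpoint would stay DOWN until the application flips the flag, so a container health probe pointed at it should not report the container healthy before the application is actually up.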