Closed — maflynn closed this issue 4 years ago
I'm having the same exact issue; I tried a few different tags as well.
I also encountered this WARNING (using Postgres, but a similar message was logged). The problem was something else; this was just the last WARNING logged. It is also logged when everything works correctly, even when you are not running Keycloak in a cluster.
It happens because the official Keycloak image recipe does not check whether the driver was already registered; it simply tries to register it on every startup. See https://github.com/keycloak/keycloak-containers/blob/ccde71f8931ac0c1c216c9b1e61dfe326018a53b/server/tools/cli/databases/mariadb/change-database.cli
The startup issue might also be caused by a custom startup script, if you have one. If a startup script configures a logger and creates it without first checking whether it already exists, you can run into a similar situation.
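To illustrate the idempotency problem, a guarded version of the driver registration could look like the sketch below. This is not the official script; the `read-resource()` guard is the assumption here, while the `add` parameters are the ones the official mariadb `change-database.cli` uses:

```
embed-server --server-config=standalone-ha.xml --std-out=echo
# Only register the driver if it is not present yet, so the script
# can safely run on every container start.
if (outcome != success) of /subsystem=datasources/jdbc-driver=mariadb:read-resource()
    /subsystem=datasources/jdbc-driver=mariadb:add(driver-name=mariadb, driver-module-name=org.mariadb.jdbc, driver-xa-datasource-class-name=org.mariadb.jdbc.MySQLDataSource)
end-if
stop-embedded-server
```

The same `if (outcome != success) of …read-resource()` pattern applies to any `add` operation that the script re-runs on restart.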
Correction to my previous comment: the JDBC_PING.cli script does indeed contain commands that fail when re-run (and they run on every restart). If you remove the lines

```
/subsystem=jgroups/stack=udp:remove()
```

and

```
/socket-binding-group=standard-sockets/socket-binding=jgroups-mping:remove()
```

it will also start up correctly on subsequent runs.

I tried to execute them conditionally, e.g.

```
if (outcome == success) of /subsystem=jgroups/stack=udp:read-resource()
    /subsystem=jgroups/stack=udp:remove()
end-if
```

but this did not work in this context. Another attempt was wrapping these two lines in a try/catch/end-try block, but that did not work either. Removing the two lines does not seem to cause any harm.
Thanks @maflynn , @kplantjr and @laszlomiklosik for your comments.
I will try to have a look at it as soon as I get some free time.
Hi, I was able to reproduce the issue. Indeed, once there are, for instance, two running Keycloak instances using the JDBC_PING discovery protocol and one of them restarts, that one cannot join the cluster again.
After several tests using MySQL, MariaDB, and Postgres, I came to the conclusion (based on @laszlomiklosik's suggestion, thanks for that) that we don't need `/subsystem=jgroups/stack=udp:remove()` and `/socket-binding-group=standard-sockets/socket-binding=jgroups-mping:remove()`. I removed them and now everything looks OK. No problem on restarting anymore.
Besides, I realized that the command for creating the `JGROUPSPING` table was correct for MySQL and MariaDB, but didn't work for Postgres. Because of that, I've created (in version 11.0.2) different JDBC_PING CLI files for each database.
Thanks for the update. The solution I came up with meanwhile:
```
embed-server --server-config=standalone-ha.xml --std-out=echo

if (outcome != success) of /subsystem=logging/logger=org.infinispan.CLUSTER:read-resource()
    /subsystem=logging/logger=org.infinispan.CLUSTER:add(level=INFO)
end-if

batch
/subsystem=infinispan/cache-container=keycloak/distributed-cache=sessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=authenticationSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=offlineSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=loginFailures:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=actionTokens:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=clientSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=offlineClientSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=jgroups/stack=tcp:remove()
/subsystem=jgroups/stack=tcp:add()
/subsystem=jgroups/stack=tcp/transport=TCP:add(socket-binding="jgroups-tcp")
/subsystem=jgroups/stack=tcp/protocol=JDBC_PING:add(add-index=0, properties=$keycloak_jgroups_discovery_protocol_properties)
/subsystem=jgroups/stack=tcp/protocol=MERGE3:add()
/subsystem=jgroups/stack=tcp/protocol=FD_SOCK:add(socket-binding="jgroups-tcp-fd")
/subsystem=jgroups/stack=tcp/protocol=FD:add()
/subsystem=jgroups/stack=tcp/protocol=VERIFY_SUSPECT:add()
/subsystem=jgroups/stack=tcp/protocol=pbcast.NAKACK2:add()
/subsystem=jgroups/stack=tcp/protocol=UNICAST3:add()
/subsystem=jgroups/stack=tcp/protocol=pbcast.STABLE:add()
/subsystem=jgroups/stack=tcp/protocol=pbcast.GMS:add()
/subsystem=jgroups/stack=tcp/protocol=pbcast.GMS/property=max_join_attempts:add(value=5)
/subsystem=jgroups/stack=tcp/protocol=MFC:add()
/subsystem=jgroups/stack=tcp/protocol=FRAG3:add()
/subsystem=jgroups/channel=ee:write-attribute(name=stack, value=tcp)
run-batch

if (outcome == success) of /subsystem=jgroups/stack=udp/protocol=PING:read-resource()
    /subsystem=jgroups/stack=udp/protocol=PING:remove()
end-if
if (outcome == success) of /subsystem=jgroups/stack=tcp/protocol=MPING:read-resource()
    /subsystem=jgroups/stack=tcp/protocol=MPING:remove()
end-if

try
    :resolve-expression(expression=${env.JGROUPS_DISCOVERY_EXTERNAL_IP})
    /subsystem=jgroups/stack=tcp/transport=TCP/property=external_addr/:add(value=${env.JGROUPS_DISCOVERY_EXTERNAL_IP})
catch
    echo "JGROUPS_DISCOVERY_EXTERNAL_IP maybe not set."
end-try

stop-embedded-server
```
This is the complete JDBC_PING.cli file, which I mount into the official Keycloak image. I took some inspiration from the original keycloak-containers/server/tools/cli/jgroups/discovery/default.cli (and server/tools/jgroups.sh) files, which in theory support JDBC_PING as well, but unfortunately only with local IPs: they don't support registering external IPs (which does not make much sense, as the default multicast ping would also work when everything runs on the same machine/Docker network).
In Docker I specify the following environment variables:

```
- JGROUPS_DISCOVERY_EXTERNAL_IP=external_ip_goes_here
- JGROUPS_DISCOVERY_PROTOCOL=JDBC_PING
- JGROUPS_DISCOVERY_PROPERTIES=datasource_jndi_name=java:jboss/datasources/KeycloakDS,info_writer_sleep_time=500,initialize_sql="CREATE TABLE IF NOT EXISTS JGROUPSPING ( own_addr varchar(200) NOT NULL, cluster_name varchar(200) NOT NULL, created timestamp default current_timestamp, ping_data BYTEA, constraint PK_JGROUPSPING PRIMARY KEY (own_addr, cluster_name))"
```
so the table-creation query is configurable this way (I therefore expect the same CLI script to work for all DB vendors).
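Note that `BYTEA` in the `initialize_sql` above is Postgres-specific, so the statement would need a different binary column type per vendor. A MySQL/MariaDB-flavoured variant might look like this (a sketch, not taken from the official scripts; `varbinary(5000)` is an assumption based on the default JGroups JDBC_PING table layout):

```sql
CREATE TABLE IF NOT EXISTS JGROUPSPING (
  own_addr varchar(200) NOT NULL,
  cluster_name varchar(200) NOT NULL,
  created timestamp DEFAULT current_timestamp,
  ping_data varbinary(5000),
  CONSTRAINT PK_JGROUPSPING PRIMARY KEY (own_addr, cluster_name)
)
```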
Hey @laszlomiklosik, brilliant solution! I will adapt mine.
Btw, I am thinking about reducing the complexity a bit by hiding the `JGROUPSPING` create-table SQL, so that the final user doesn't need to provide it. My initial solution was a lazy one: I basically created scripts for (at least 3) DB vendors (Oracle and MSSQL are still missing). However, I believe it's possible to get the DB vendor (which was set as one of the parameters for the Docker container, `DB_VENDOR`) inside the JDBC_PING script.
When Keycloak starts, it checks the selected DB (https://github.com/keycloak/keycloak-containers/blob/master/server/tools/docker-entrypoint.sh#L231) and runs the `change-database.sh` script, passing the DB vendor. Depending on the DB, it then runs the `/cli/databases/<DB>/change-database.cli` script.
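That dispatch step can be sketched roughly as follows. This is illustrative shell, not the actual entrypoint; the function name and the `/opt/jboss/tools` path layout are assumptions based on the repository structure:

```shell
# Sketch: map DB_VENDOR to the matching change-database.cli file,
# mirroring what docker-entrypoint.sh / change-database.sh do.
select_change_database_cli() {
  case "$1" in
    mysql|mariadb|postgres|oracle|mssql)
      # Each supported vendor has its own CLI file under cli/databases/<DB>/
      printf '/opt/jboss/tools/cli/databases/%s/change-database.cli\n' "$1"
      ;;
    *)
      echo "Unknown DB_VENDOR: $1" >&2
      return 1
      ;;
  esac
}

select_change_database_cli "mariadb"
```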
For instance, this is the script for MySQL: https://github.com/keycloak/keycloak-containers/blob/master/server/tools/cli/databases/mysql/change-database.cli#L2. Maybe, by using `read-resource()` on the `KeycloakDS` datasource, it's possible to get the `driver-name`.
In the JDBC_PING script, I've implemented the creation of the `JGROUPSPING` table as described above.
I am closing this issue. Please, feel free to reopen it in case I can help with something.
A little info on my setup: I have 3 CentOS 7 hosts running the latest Docker engine from the official Docker repository. A MariaDB Galera cluster runs in containers across all 3 hosts as the shared database for the Keycloak cluster. I'm using the latest Keycloak image from jboss/keycloak:latest and the JDBC_PING mod to cluster.
My `docker run` syntax (IPs and passwords removed):

```shell
docker run -d -p 8443:8443 -p 7600:7600 \
  -e KEYCLOAK_USER=admin \
  -e KEYCLOAK_PASSWORD=$KC_PASS \
  -e DB_VENDOR=mariadb \
  -e DB_ADDR=$DB_IP \
  -e DB_PORT=32775 \
  -e DB_USER=keycloak \
  -e DB_PASSWORD=$DB_PASS \
  -e DB_DATABASE=keycloak \
  -e JGROUPS_DISCOVERY_EXTERNAL_IP=$EXTERNAL_IP \
  -e JGROUPS_DISCOVERY_PROTOCOL=JDBC_PING \
  -e JGROUPS_DISCOVERY_PROPERTIES=datasource_jndi_name=java:jboss/datasources/KeycloakDS \
  -v /etc/x509/https/tls.crt:/etc/x509/https/tls.crt \
  -v /etc/x509/https/tls.key:/etc/x509/https/tls.key \
  --name keycloak ivanfranchin/keycloak-clustered:latest
```
When I do the initial docker run command they all start fine. They connect to the database and register themselves in the JGROUPSPING table in the database. I can log in to each one individually and am able to see 3 different sessions being shared between all of them. Everything appears to be working correctly.
If I stop a container (`docker stop keycloak`) and try to restart it, it will not come back up.
The error I see in the docker logs is:

```
The batch failed with the following error: WFLYCTL0062: Composite operation failed and was rolled back. Steps that failed:
Step: step-9
Operation: /subsystem=datasources/jdbc-driver=mariadb:add(driver-name=mariadb, driver-module-name=org.mariadb.jdbc, driver-xa-datasource-class-name=org.mariadb.jdbc.MySQLDataSource)
Failure: WFLYCTL0212: Duplicate resource [ ("subsystem" => "datasources"), ("jdbc-driver" => "mariadb") ]
```
It looks like WildFly is registering the mariadb datasource driver again, but I don't know how that could even persist in the container after it's stopped.
If I delete the container (`docker rm keycloak`) and re-run it, it starts and rejoins the cluster.