quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.73k stars 2.67k forks source link

Kogito Dev Services raise exception with podman 4.4.2 #31652

Closed dmarrazzo closed 1 year ago

dmarrazzo commented 1 year ago

Describe the bug

OpenShift Serverless Logic has it's own dev services: Quarkus project: https://github.com/dmarrazzo/order-swf

Full tutorial: https://redhat-scholars.github.io/serverless-workflow

Starting quarkus in dev mode trigger the kogito dev services:

Expected behavior

dev services should be started without issues.

Actual behavior

Exceptions are thrown, e.g.:

2023-03-07 12:28:28,427 ERROR [io.qua.dep.dev.IsolatedDevModeMain] (main) Failed to start quarkus: java.lang.RuntimeException: io.quarkus.builder.BuildException: Build failure: Build failed due to errors
    [error]: Build step org.kie.kogito.quarkus.processes.deployment.KogitoDevServicesProcessor#startDataIndexDevService threw an exception: java.lang.IllegalStateException: java.io.IOException: There was a problem while connecting to localhost:38155
    at io.quarkus.deployment.ExtensionLoader$2.execute(ExtensionLoader.java:891)
    at io.quarkus.builder.BuildContext.run(BuildContext.java:277)
    at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2449)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1478)
    at java.base/java.lang.Thread.run(Thread.java:833)
    at org.jboss.threads.JBossThread.run(JBossThread.java:501)
Caused by: java.io.IOException: There was a problem while connecting to localhost:38155
    at org.testcontainers.shaded.com.trilead.ssh2.Connection.connect(Connection.java:755)
    at org.testcontainers.containers.PortForwardingContainer.createSSHSession(PortForwardingContainer.java:47)
    at org.testcontainers.containers.PortForwardingContainer.getSshConnection(PortForwardingContainer.java:27)
    at org.testcontainers.containers.PortForwardingContainer.exposeHostPort(PortForwardingContainer.java:68)
    at org.testcontainers.containers.PortForwardingContainer.exposeHostPort(PortForwardingContainer.java:62)
    at org.testcontainers.Testcontainers.exposeHostPorts(Testcontainers.java:15)
    at org.kie.kogito.quarkus.processes.deployment.KogitoDevServicesProcessor.startDataIndexDevService(KogitoDevServicesProcessor.java:130)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at io.quarkus.deployment.ExtensionLoader$2.execute(ExtensionLoader.java:882)
    ... 6 more
Caused by: java.io.IOException: Key exchange was not finished, connection is closed.
    at org.testcontainers.shaded.com.trilead.ssh2.transport.KexManager.getOrWaitForConnectionInfo(KexManager.java:93)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportManager.getConnectionInfo(TransportManager.java:231)
    at org.testcontainers.shaded.com.trilead.ssh2.Connection.connect(Connection.java:711)
    ... 17 more
Caused by: java.io.IOException: Cannot read full block, EOF reached.
    at org.testcontainers.shaded.com.trilead.ssh2.crypto.cipher.CipherInputStream.getBlock(CipherInputStream.java:81)
    at org.testcontainers.shaded.com.trilead.ssh2.crypto.cipher.CipherInputStream.read(CipherInputStream.java:108)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportConnection.receiveMessage(TransportConnection.java:232)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:672)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:470)
    at java.base/java.lang.Thread.run(Thread.java:833)

    at io.quarkus.runner.bootstrap.AugmentActionImpl.runAugment(AugmentActionImpl.java:330)
    at io.quarkus.runner.bootstrap.AugmentActionImpl.createInitialRuntimeApplication(AugmentActionImpl.java:252)
    at io.quarkus.runner.bootstrap.AugmentActionImpl.createInitialRuntimeApplication(AugmentActionImpl.java:60)
    at io.quarkus.deployment.dev.IsolatedDevModeMain.firstStart(IsolatedDevModeMain.java:92)
    at io.quarkus.deployment.dev.IsolatedDevModeMain.accept(IsolatedDevModeMain.java:455)
    at io.quarkus.deployment.dev.IsolatedDevModeMain.accept(IsolatedDevModeMain.java:66)
    at io.quarkus.bootstrap.app.CuratedApplication.runInCl(CuratedApplication.java:139)
    at io.quarkus.bootstrap.app.CuratedApplication.runInAugmentClassLoader(CuratedApplication.java:95)
    at io.quarkus.deployment.dev.DevModeMain.start(DevModeMain.java:132)
    at io.quarkus.deployment.dev.DevModeMain.main(DevModeMain.java:62)
Caused by: io.quarkus.builder.BuildException: Build failure: Build failed due to errors
    [error]: Build step org.kie.kogito.quarkus.processes.deployment.KogitoDevServicesProcessor#startDataIndexDevService threw an exception: java.lang.IllegalStateException: java.io.IOException: There was a problem while connecting to localhost:38155
    at io.quarkus.deployment.ExtensionLoader$2.execute(ExtensionLoader.java:891)
    at io.quarkus.builder.BuildContext.run(BuildContext.java:277)
    at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2449)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1478)
    at java.base/java.lang.Thread.run(Thread.java:833)
    at org.jboss.threads.JBossThread.run(JBossThread.java:501)
Caused by: java.io.IOException: There was a problem while connecting to localhost:38155
    at org.testcontainers.shaded.com.trilead.ssh2.Connection.connect(Connection.java:755)
    at org.testcontainers.containers.PortForwardingContainer.createSSHSession(PortForwardingContainer.java:47)
    at org.testcontainers.containers.PortForwardingContainer.getSshConnection(PortForwardingContainer.java:27)
    at org.testcontainers.containers.PortForwardingContainer.exposeHostPort(PortForwardingContainer.java:68)
    at org.testcontainers.containers.PortForwardingContainer.exposeHostPort(PortForwardingContainer.java:62)
    at org.testcontainers.Testcontainers.exposeHostPorts(Testcontainers.java:15)
    at org.kie.kogito.quarkus.processes.deployment.KogitoDevServicesProcessor.startDataIndexDevService(KogitoDevServicesProcessor.java:130)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at io.quarkus.deployment.ExtensionLoader$2.execute(ExtensionLoader.java:882)
    ... 6 more
Caused by: java.io.IOException: Key exchange was not finished, connection is closed.
    at org.testcontainers.shaded.com.trilead.ssh2.transport.KexManager.getOrWaitForConnectionInfo(KexManager.java:93)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportManager.getConnectionInfo(TransportManager.java:231)
    at org.testcontainers.shaded.com.trilead.ssh2.Connection.connect(Connection.java:711)
    ... 17 more
Caused by: java.io.IOException: Cannot read full block, EOF reached.
    at org.testcontainers.shaded.com.trilead.ssh2.crypto.cipher.CipherInputStream.getBlock(CipherInputStream.java:81)
    at org.testcontainers.shaded.com.trilead.ssh2.crypto.cipher.CipherInputStream.read(CipherInputStream.java:108)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportConnection.receiveMessage(TransportConnection.java:232)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:672)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:470)
    at java.base/java.lang.Thread.run(Thread.java:833)

    at io.quarkus.builder.Execution.run(Execution.java:116)
    at io.quarkus.builder.BuildExecutionBuilder.execute(BuildExecutionBuilder.java:79)
    at io.quarkus.deployment.QuarkusAugmentor.run(QuarkusAugmentor.java:157)
    at io.quarkus.runner.bootstrap.AugmentActionImpl.runAugment(AugmentActionImpl.java:328)
    ... 9 more
Caused by: java.lang.IllegalStateException: java.io.IOException: There was a problem while connecting to localhost:38155
    at io.quarkus.deployment.ExtensionLoader$2.execute(ExtensionLoader.java:891)
    at io.quailder.BuildContext.run(BuildContext.java:277)
    at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2449)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1478)
    at java.base/java.lang.Thread.run(Thread.java:833)
    at org.jboss.threads.JBossThread.run(JBossThread.java:501)
Caused by: java.io.IOException: There was a problem while connecting to localhost:38155
    at org.testcontainers.shaded.com.trilead.ssh2.Connection.connect(Connection.java:755)
    at org.testcontainers.containers.PortForwardingContainer.createSSHSession(PortForwardingContainer.java:47)
    at org.testcontainers.containers.PortForwardingContainer.getSshConnection(PortForwardingContainer.java:27)
    at org.testcontainers.containers.PortForwardingContainer.exposeHostPort(PortForwardingContainer.java:68)
    at org.testcontainers.containers.PortForwardingContainer.exposeHostPort(PortForwardingContainer.java:62)
    at org.testcontainers.Testcontainers.exposeHostPorts(Testcontainers.java:15)
    at org.kie.kogito.quarkus.processes.deployment.KogitoDevServicesProcessor.startDataIndexDevService(KogitoDevServicesProcessor.java:130)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at io.quarkus.deployment.ExtensionLoader$2.execute(ExtensionLoader.java:882)
    ... 6 more
Caused by: java.io.IOException: Key exchange was not finished, connection is closed.
    at org.testcontainers.shaded.com.trilead.ssh2.transport.KexManager.getOrWaitForConnectionInfo(KexManager.java:93)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportManager.getConnectionInfo(TransportManager.java:231)
    at org.testcontainers.shaded.com.trilead.ssh2.Connection.connect(Connection.java:711)
    ... 17 more
Caused by: java.io.IOException: Cannot read full block, EOF reached.
    at org.testcontainers.shaded.com.trilead.ssh2.crypto.cipher.CipherInputStream.getBlock(CipherInputStream.java:81)
    at org.testcontainers.shaded.com.trilead.ssh2.crypto.cipher.CipherInputStream.read(CipherInputStream.java:108)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportConnection.receiveMessage(TransportConnection.java:232)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:672)
    at org.testcontainers.shaded.com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:470)
    at java.base/java.lang.Thread.run(Thread.java:833)

How to Reproduce?

  1. configure podman 4.4.2 https://quarkus.io/guides/podman
  2. clone the repo https://github.com/dmarrazzo/order-swf
  3. launch quarkus in dev mode

Output of uname -a or ver

Linux mas-p1 6.1.14-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Feb 26 00:13:26 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Output of java -version

openjdk version "11.0.18" 2023-01-17 OpenJDK Runtime Environment (Red_Hat-11.0.18.0.10-1.fc37) (build 11.0.18+10) OpenJDK 64-Bit Server VM (Red_Hat-11.0.18.0.10-1.fc37) (build 11.0.18+10, mixed mode, sharing)

GraalVM version (if different from Java)

No response

Quarkus version or git rev

2.7.6.Final-redhat-00011

Build tool (ie. output of mvnw --version or gradlew --version)

Maven home: /usr/share/maven Java version: 17.0.6, vendor: Red Hat, Inc., runtime: /usr/lib/jvm/java-17-openjdk-17.0.6.0.10-1.fc37.x86_64 Default locale: en_US, platform encoding: UTF-8 OS name: "linux", version: "6.1.14-200.fc37.x86_64", arch: "amd64", family: "unix"

Additional information

No response

quarkus-bot[bot] commented 1 year ago

/cc @evacchi (kogito), @geoand (devservices,openshift), @iocanel (openshift), @mariofusco (kogito), @stuartwdouglas (devservices)

gsmet commented 1 year ago

Is it a problem only for the Kogito Dev Services or something more generic with Dev Services?

You don't have anything else in the log about why the container didn't start? Either in Quarkus or in Podman?

dmarrazzo commented 1 year ago

At the moment, I've spotted this behavior only for Kogito dev services, whereas Kafka and Postgres works pretty well. I've noticed that with the latest version of the container and kogito libraries, it has a weird behavior:

Can you provide instructions to gather more information?

E.g. Increasing the log level on quarkus side and or on podman side?

Thank you

TheHaf commented 1 year ago

As the stacktrace shows a problem at "org.testcontainers.shaded.com.trilead.ssh2.*", could it be that you are using the Testcontainers.exposeHostPorts(...) method call? I noticed that causes problems using a current podman version, but I haven't found out the reason yet. It seems that method causes Testcontainers to start a SSHD container to expose the given port to the container using reverse port forwarding.

cristianonicolai commented 1 year ago

@TheHaf correct, exposeHostPorts is used as the container needs to send requests to the running Quarkus application. See https://github.com/kiegroup/kogito-runtimes/blob/main/quarkus/extensions/kogito-quarkus-workflow-extension-common/kogito-quarkus-workflow-common-deployment/src/main/java/org/kie/kogito/quarkus/workflow/deployment/AbstractDevServicesProcessor.java#L90 Is there any known workaround?

cristianonicolai commented 1 year ago

potentially related and fixed by https://github.com/testcontainers/testcontainers-java/issues/2088 and https://github.com/testcontainers/testcontainers-java/issues/5482

TheHaf commented 1 year ago

The second issue provides a hint regarding the used SSH client lib (https://github.com/testcontainers/testcontainers-java/issues/5482#issuecomment-1153241155).

I did a bit more digging myself and I suspect that the SSH2 library that Testcontainers shades and uses internally (com.trilead ssh2) causes the issue ultimately. That library has not been updated for the past six years apparently. The Jenkins CI project also makes use of the library, but they have their own fork that they maintain.

My guess is that the outdated SSH2 library is incompatible in some way to the current podman version. I haven't had the time yet to investigate further unfortunately. Ultimately if my guess is correct, the way forward would probably be to replace the outdated ssh library in Testcontainers. One alternative that I saw being mentioned somewhere is apparently being maintaned by the Apache folks.

TheHaf commented 1 year ago

I took another look into this problem today. Turns out, this has nothing to do with the SSH implementation in testcontainers.

First i tried to manually run the sshd container with podman and with debugging enabled. I encountered a suspisious log message chroot("/var/empty"): Operation not permitted [preauth].

After some more research, this brought me to this blog post, explaining that, among others, the capability SYS_CHROOT has been dropped as a default in podman release 4.4.0.

According to the release notes for version 4.4.3, this capability has been added back to the defaults already. So updating to that version of podman or a newer one should make things work again.

I do not have the option to update podman manually on my workstation and am stuck with a version that doesn't contain the necessary default until the distribution maintainers update their repository. The question is, what are the options in such a case? Here are some options:

  1. When running a container manually that needs this capability, you can add the option --cap-add SYS_CHROOT to the podman run command line.

  2. Add the capability to the default_capabilities config in /etc/containers/containers.conf (this is what I did) to have it on all containers:

    default_capabilities = [
    "CHOWN",
    "DAC_OVERRIDE",
    "FOWNER",
    "FSETID",
    "KILL",
    "NET_BIND_SERVICE",
    "SETFCAP",
    "SETGID",
    "SETPCAP",
    "SETUID",
    "SYS_CHROOT",
    ]

There might be other solutions, like using a patched version of testcontainers-core that adds the capability manually (using withCreateContainerCmdModifier(cmd -> cmd.getHostConfig().withCapAdd(Capability.SYS_CHROOT))) in the enum PortForwardingContainer.

Hope that helps.

Note: Apparently it should also be possible to add the capability when building a container using podman build, but for some reason that didn't change a thing for me. Maybe that functionality is currently broken.

dmarrazzo commented 1 year ago

Hi @TheHaf thank you very much for the detailed investigation. I can confirm that the sanity is restored with podman 4.4.3!!! Let's hope that nobody decide to drop it again at next iteration of podman!!! Do you think there is a way to make podman team aware that we rely on that feature?

TheHaf commented 1 year ago

Sorry for not reacting sooner @dmarrazzo.

Well, the podman version 4.5.0 release notes also state that this capability has been re-added, so I would assume that they are going to keep it at least for the forseeable future.

I don't know what the best way to contact the podman maintainers directly would be, but there is an overview of official community channels on the project website: https://podman.io/community

You should be able to reach them through one of the listed channels. Hope that helps.

dmarrazzo commented 1 year ago

thank you @TheHaf Closing issue...