quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.49k stars 2.6k forks source link

Test failure with podman: Quarkus - JDBC - DB2 - Deployment #33104

Open holly-cummins opened 1 year ago

holly-cummins commented 1 year ago

Describe the bug

I am working on a CI run which uses podman, and it has shown several failing tests. https://github.com/holly-cummins/quarkus/actions/runs/4844611295/jobs/8632987478 is an example run with failures. https://github.com/holly-cummins/quarkus/tree/podman-build has the (messy) workflow code.

Note that I’ve disabled this test when the CI is running with Podman, to keep the build clean. As part of the fix for this, the IS_PODMAN guard should be removed from the pom.xml.

Expected behavior

No response

Actual behavior

2023-04-28T20:44:31.7644277Z 2023-04-28 20:44:31,755 WARN  [io.agr.pool] (agroal-11) Datasource '<default>': [jcc][t4][2030][11211][4.32.28] A communication error occurred during operations on the connection's underlying socket, socket input stream, 
2023-04-28T20:44:31.7645161Z or socket output stream.  Error location: Reply.fill() - socketInputStream.read (-1).  Message: Connection reset. ERRORCODE=-4499, SQLSTATE=08001
2023-04-28T20:44:31.8647604Z 2023-04-28 20:44:31,782 INFO  [io.quarkus] (main) quarkus-jdbc-db2-deployment stopped in 0.013s
2023-04-28T20:44:32.2052308Z [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.469 s <<< FAILURE! - in io.quarkus.jdbc.db2.deployment.DevServicesDB2DatasourceTestCase
2023-04-28T20:44:32.2054446Z [ERROR] io.quarkus.jdbc.db2.deployment.DevServicesDB2DatasourceTestCase.testDatasource  Time elapsed: 0.713 s  <<< ERROR!
2023-04-28T20:44:32.2055196Z com.ibm.db2.jcc.am.DisconnectNonTransientConnectionException: 
2023-04-28T20:44:32.2056354Z [jcc][t4][2030][11211][4.32.28] A communication error occurred during operations on the connection's underlying socket, socket input stream, 
2023-04-28T20:44:32.2057099Z or socket output stream.  Error location: Reply.fill() - socketInputStream.read (-1).  Message: Connection reset. ERRORCODE=-4499, SQLSTATE=08001
2023-04-28T20:44:32.2057508Z    at com.ibm.db2.jcc.am.b7.a(b7.java:338)
2023-04-28T20:44:32.2057794Z    at com.ibm.db2.jcc.t4.a.a(a.java:577)
2023-04-28T20:44:32.2058074Z    at com.ibm.db2.jcc.t4.a.a(a.java:561)
2023-04-28T20:44:32.2058326Z    at com.ibm.db2.jcc.t4.a.a(a.java:556)
2023-04-28T20:44:32.2058578Z    at com.ibm.db2.jcc.t4.a.b(a.java:1727)
2023-04-28T20:44:32.2058836Z    at com.ibm.db2.jcc.t4.a.a(a.java:1752)
2023-04-28T20:44:32.2059082Z    at com.ibm.db2.jcc.t4.y.b(y.java:276)
2023-04-28T20:44:32.2059319Z    at com.ibm.db2.jcc.t4.y.c(y.java:342)
2023-04-28T20:44:32.2059564Z    at com.ibm.db2.jcc.t4.y.c(y.java:455)
2023-04-28T20:44:32.2059805Z    at com.ibm.db2.jcc.t4.y.v(y.java:1230)
2023-04-28T20:44:32.2060062Z    at com.ibm.db2.jcc.t4.z.a(z.java:53)
2023-04-28T20:44:32.2060315Z    at com.ibm.db2.jcc.t4.b.c(b.java:1489)
2023-04-28T20:44:32.2060843Z    at com.ibm.db2.jcc.t4.b.b(b.java:1358)
2023-04-28T20:44:32.2061091Z    at com.ibm.db2.jcc.t4.b.b(b.java:891)
2023-04-28T20:44:32.2061338Z    at com.ibm.db2.jcc.t4.b.a(b.java:862)
2023-04-28T20:44:32.2061589Z    at com.ibm.db2.jcc.t4.b.a(b.java:457)
2023-04-28T20:44:32.2061841Z    at com.ibm.db2.jcc.t4.b.a(b.java:430)
2023-04-28T20:44:32.2062098Z    at com.ibm.db2.jcc.t4.b.<init>(b.java:368)
2023-04-28T20:44:32.2062497Z    at com.ibm.db2.jcc.DB2SimpleDataSource.getConnection(DB2SimpleDataSource.java:243)
2023-04-28T20:44:32.2062972Z    at com.ibm.db2.jcc.DB2SimpleDataSource.getConnection(DB2SimpleDataSource.java:200)
2023-04-28T20:44:32.2063382Z    at com.ibm.db2.jcc.DB2Driver.connect(DB2Driver.java:491)
2023-04-28T20:44:32.2063722Z    at com.ibm.db2.jcc.DB2Driver.connect(DB2Driver.java:117)
2023-04-28T20:44:32.2064136Z    at io.agroal.pool.ConnectionFactory.createConnection(ConnectionFactory.java:226)
2023-04-28T20:44:32.2064596Z    at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:536)
2023-04-28T20:44:32.2065043Z    at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:517)
2023-04-28T20:44:32.2065462Z    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2023-04-28T20:44:32.2065956Z    at io.agroal.pool.util.PriorityScheduledExecutor.beforeExecute(PriorityScheduledExecutor.java:75)
2023-04-28T20:44:32.2066595Z    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
2023-04-28T20:44:32.2067093Z    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2023-04-28T20:44:32.2067481Z    at java.base/java.lang.Thread.run(Thread.java:833)
2023-04-28T20:44:32.2067800Z Caused by: java.net.SocketException: Connection reset
2023-04-28T20:44:32.2068168Z    at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:323)
2023-04-28T20:44:32.2068630Z    at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350)
2023-04-28T20:44:32.2069006Z    at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803)
2023-04-28T20:44:32.2069424Z    at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
2023-04-28T20:44:32.2069726Z    at com.ibm.db2.jcc.t4.y.b(y.java:230)
2023-04-28T20:44:32.2069942Z    ... 23 more
2023-04-28T20:44:32.2070401Z 
2023-04-28T20:44:32.2452886Z [INFO] 
2023-04-28T20:44:32.2453612Z [INFO] Results:
2023-04-28T20:44:32.2454056Z [INFO] 
2023-04-28T20:44:32.2454597Z [ERROR] Errors: 
2023-04-28T20:44:32.2455655Z [ERROR]   DevServicesDB2DatasourceTestCase.testDatasource » DisconnectNonTransientConnection [jcc][t4][2030][11211][4.32.28] A communication error occurred during operations on the connection's underlying socket, socket input stream, 
2023-04-28T20:44:32.2456700Z or socket output stream.  Error location: Reply.fill() - socketInputStream.read (-1).  Message: Connection reset. ERRORCODE=-4499, SQLSTATE=08001
2023-04-28T20:44:32.2523397Z [INFO] 
2023-04-28T20:44:32.2599287Z [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
2023-04-28T20:44:32.2601290Z [INFO] 
2023-04-28T20:44:32.2602444Z [INFO] 

How to Reproduce?

Reproduction note: I haven't been able to reproduce this locally on mac with podman 4.03. With the latest mac version of podman 4.4.4, I get a permission failure running rootless, and it passes rootful.

Picking up my branch to reproduce on CI should work, or running locally on linux with a podman engine of 4.5.

Output of uname -a or ver

No response

Output of java -version

No response

GraalVM version (if different from Java)

No response

Quarkus version or git rev

No response

Build tool (ie. output of mvnw --version or gradlew --version)

No response

Additional information

Environment Info

2023-05-02T10:44:48.4887010Z ##[group]Run docker version
2023-05-02T10:44:48.4887318Z �[36;1mdocker version�[0m
2023-05-02T10:44:48.4949771Z shell: /usr/bin/bash -e {0}
2023-05-02T10:44:48.4950031Z env:
2023-05-02T10:44:48.4950252Z   LANG: en_US.UTF-8
2023-05-02T10:44:48.4950613Z   COMMON_MAVEN_ARGS: -e -B --settings .github/mvn-settings.xml --fail-at-end
2023-05-02T10:44:48.4951271Z   NATIVE_TEST_MAVEN_ARGS: -Dtest-containers -Dstart-containers -Dquarkus.native.native-image-xmx=5g -Dnative -Dnative.surefire.skip -Dformat.skip -Dno-descriptor-tests install -DskipDocs
2023-05-02T10:44:48.4952041Z   JVM_TEST_MAVEN_ARGS: -Dtest-containers -Dstart-containers -Dformat.skip -DskipDocs -Dquarkus.test.hang-detection-timeout=60
2023-05-02T10:44:48.4952507Z   DB_USER: hibernate_orm_test
2023-05-02T10:44:48.4952790Z   DB_PASSWORD: hibernate_orm_test
2023-05-02T10:44:48.4953053Z   DB_NAME: hibernate_orm_test
2023-05-02T10:44:48.4953388Z   IS_PODMAN: true
2023-05-02T10:44:48.4953637Z   MAVEN_OPTS: -Xmx2g -XX:MaxMetaspaceSize=1g
2023-05-02T10:44:48.4954244Z   JAVA_HOME: /opt/hostedtoolcache/Java_Temurin-Hotspot_jdk/17.0.6-10/x64
2023-05-02T10:44:48.4954619Z   JAVA_HOME_17_X64: /opt/hostedtoolcache/Java_Temurin-Hotspot_jdk/17.0.6-10/x64
2023-05-02T10:44:48.4954970Z   DOCKER_HOST: unix:///run/user/1001/podman/podman.sock
2023-05-02T10:44:48.4955210Z ##[endgroup]
2023-05-02T10:44:49.5351123Z Client:
2023-05-02T10:44:49.5351708Z  Version:           20.10.24+azure-1
2023-05-02T10:44:49.5351963Z  API version:       1.41
2023-05-02T10:44:49.5352176Z  Go version:        go1.19.6
2023-05-02T10:44:49.5352453Z  Git commit:        297e1284d3bd092e9bc96076c3ddc4bb33f8c7ab
2023-05-02T10:44:49.5352738Z  Built:             Fri Mar 31 18:34:38 UTC 2023
2023-05-02T10:44:49.5352965Z  OS/Arch:           linux/amd64
2023-05-02T10:44:49.5353186Z  Context:           default
2023-05-02T10:44:49.5353414Z  Experimental:      true
2023-05-02T10:44:49.5353560Z 
2023-05-02T10:44:49.5353709Z Server: linux/amd64/ubuntu-22.04
2023-05-02T10:44:49.5353969Z  Podman Engine:
2023-05-02T10:44:49.5354202Z   Version:          4.5.0
2023-05-02T10:44:49.5354439Z   APIVersion:       4.5.0
2023-05-02T10:44:49.5354636Z   Arch:             amd64
2023-05-02T10:44:49.5355016Z   BuildTime:        1970-01-01T00:00:00Z
2023-05-02T10:44:49.5355260Z   Experimental:     false
2023-05-02T10:44:49.5355459Z   GitCommit:        
2023-05-02T10:44:49.5355677Z   GoVersion:        go1.18.1
2023-05-02T10:44:49.5355968Z   KernelVersion:    5.15.0-1036-azure
2023-05-02T10:44:49.5356195Z   MinAPIVersion:    4.0.0
2023-05-02T10:44:49.5356410Z   Os:               linux
2023-05-02T10:44:49.5356604Z  Conmon:
2023-05-02T10:44:49.5356824Z   Version:          conmon version 2.1.7, commit: 
2023-05-02T10:44:49.5357197Z   Package:          conmon_2:2.1.7-0ubuntu22.04+obs15.24_amd64
2023-05-02T10:44:49.5357457Z  OCI Runtime (crun):
2023-05-02T10:44:49.5357678Z   Version:          crun version 1.8.4
2023-05-02T10:44:49.5357962Z commit: 5a8fa99a5e41facba2eda4af12fa26313918805b
2023-05-02T10:44:49.5358237Z rundir: /run/user/1001/crun
2023-05-02T10:44:49.5358444Z spec: 1.0.0
2023-05-02T10:44:49.5358706Z +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
2023-05-02T10:44:49.5359097Z   Package:          crun_101:1.8.4-0ubuntu22.04+obs55.10_amd64
2023-05-02T10:44:49.5359334Z  Engine:
2023-05-02T10:44:49.5359516Z   Version:          4.5.0
2023-05-02T10:44:49.5359765Z   API version:      1.41 (minimum version 1.24)
2023-05-02T10:44:49.5360012Z   Go version:       go1.18.1
2023-05-02T10:44:49.5360212Z   Git commit:       
2023-05-02T10:44:49.5360442Z   Built:            Thu Jan  1 00:00:00 1970
2023-05-02T10:44:49.5360673Z   OS/Arch:          linux/amd64
2023-05-02T10:44:49.5360884Z   Experimental:     false
quarkus-bot[bot] commented 1 year ago

/cc @mswatosh (db2)

mswatosh commented 1 year ago

Hi @holly-cummins,

Normally when I see these types of Socket errors from the Db2 JCC driver it indicates an issue at a layer below the application server, either at the OS or network. Given this works on some OS/podman versions, that would make me think it's more likely an issue with podman. Is there something that makes you think this would be a Quarkus issue over a Podman issue?

holly-cummins commented 1 year ago

Hi @mswatosh, I agree it's unlikely to be a Quarkus-specific issue. I've raised this defect to track the triage work we'll need to do to

With several of the other issues we're seeing on the Linux podman CI they don't reproduce locally, so it seems like it might be an issue exposed by the GitHub virtualisation details.

mswatosh commented 1 year ago

That makes sense. I think you may be right about something with Github virtualization. I remember encountering something similar to this when working on an issue for the Db2 reactive driver in vert.x:

https://github.com/eclipse-vertx/vertx-sql-client/pull/1135#issuecomment-1050228717

Similar connection lost issue, that only occurred in the GH Actions build. In that case it was Vert.x + Db2 Reactive Driver accessing Db2 on docker (using Testcontainers). I probably should have reported it to Github, but ended up just working around it.