testcontainers / testcontainers-java

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
https://testcontainers.org
MIT License
7.96k stars 1.64k forks source link

[Bug]: trino reports ready before the engine is fully started #6310

Open hgschmie opened 1 year ago

hgschmie commented 1 year ago

Module

Trino

Testcontainers version

1.17.6

Using the latest Testcontainers version?

Yes

Host OS

MacOS

Host Arch

x86

Docker version

Docker version 20.10.21, build baeda1f

What happened?

Trino container starts up and reports as "ready". Running a query fails with "No nodes available to run query". Waiting a few seconds before running the query allows it to succeed.

-> The trino container reports "ready" before it is actually ready.

The log below shows that the query that testcontainers uses to check that the engine is ready actually fails with the same error (No nodes available to run query) but testcontainers still contains is ready to use

Relevant log output

[main] INFO org.testcontainers.utility.ImageNameSubstitutor - Image name substitution will be performed by: DefaultImageNameSubstitutor (composite of 'ConfigurationFileImageNameSubstitutor' and 'PrefixingImageNameSubstitutor')
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Loaded org.testcontainers.dockerclient.UnixSocketClientProviderStrategy from ~/.testcontainers.properties, will try it first
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Found Docker environment with local Unix socket (unix:///var/run/docker.sock)
[main] INFO org.testcontainers.DockerClientFactory - Docker host IP address is localhost
[main] INFO org.testcontainers.DockerClientFactory - Connected to docker: 
  Server Version: 20.10.21
  API Version: 1.41
  Operating System: Docker Desktop
  Total Memory: 7959 MB
[main] INFO 🐳 [testcontainers/ryuk:0.3.4] - Creating container for image: testcontainers/ryuk:0.3.4
[main] INFO 🐳 [testcontainers/ryuk:0.3.4] - Container testcontainers/ryuk:0.3.4 is starting: d6427b5811ee6d435d3ca898da5203afa7e252398b1948e33140b1d3e836acec
[main] INFO 🐳 [testcontainers/ryuk:0.3.4] - Container testcontainers/ryuk:0.3.4 started in PT0.801393S
[main] INFO org.testcontainers.utility.RyukResourceReaper - Ryuk started - will monitor and terminate Testcontainers containers on JVM exit
[main] INFO org.testcontainers.DockerClientFactory - Checking the system...
[main] INFO org.testcontainers.DockerClientFactory - ✔︎ Docker server version should be at least 1.6.0
[main] INFO 🐳 [trinodb/trino:latest] - Creating container for image: trinodb/trino:latest
[main] INFO 🐳 [trinodb/trino:latest] - Container trinodb/trino:latest is starting: 9a5c3abce4ca5728778fdc700fd585f158c1c8f1ad5f8b94648ee4ca57dd7d80
[main] INFO 🐳 [trinodb/trino:latest] - Waiting for database connection to become available at jdbc:trino://localhost:49864/ using query 'SELECT count(*) FROM tpch.tiny.nation'
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: + launcher_opts=(--etc-dir /etc/trino)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: + grep -s -q node.id /etc/trino/node.properties
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: + launcher_opts+=("-Dnode.id=${HOSTNAME}")
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: + exec /usr/lib/trino/bin/launcher run --etc-dir /etc/trino -Dnode.id=9a5c3abce4ca
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDOUT: # WARNING: Unable to attach Serviceability Agent. You can try again with escalated privileges. Two options: a) use -Djol.tryWithSudo=true to try with sudo; b) echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: Dec 16, 2022 6:56:14 PM io.airlift.log.Logger info
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: INFO: Java version: 17.0.5
[...]
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.113Z INFO    main    io.airlift.bootstrap.LifeCycleManager   Life cycle starting...
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.114Z INFO    main    io.airlift.bootstrap.LifeCycleManager   Life cycle started
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.115Z INFO    main    io.trino.connector.StaticCatalogManager -- Added catalog jmx using connector jmx --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.115Z INFO    main    io.trino.connector.StaticCatalogManager -- Loading catalog memory --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.186Z INFO    dispatcher-query-3  io.trino.event.QueryMonitor TIMELINE: Query 20221216_185626_00039_cuskt :: FAILED (SERVER_STARTING_UP) :: elapsed 0ms :: planning 0ms :: waiting 0ms :: scheduling 0ms :: running 0ms :: finishing 0ms :: begin 2022-12-16T18:56:26.186Z :: end 2022-12-16T18:56:26.186Z
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.306Z INFO    dispatcher-query-1  io.trino.event.QueryMonitor TIMELINE: Query 20221216_185626_00040_cuskt :: FAILED (SERVER_STARTING_UP) :: elapsed 0ms :: planning 0ms :: waiting 0ms :: scheduling 0ms :: running 0ms :: finishing 0ms :: begin 2022-12-16T18:56:26.306Z :: end 2022-12-16T18:56:26.306Z
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.382Z INFO    main    Bootstrap   PROPERTY                              DEFAULT  RUNTIME  DESCRIPTION
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.383Z INFO    main    Bootstrap   memory.enable-lazy-dynamic-filtering  true     true
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.383Z INFO    main    Bootstrap   memory.max-data-per-node              128MB    128MB
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.383Z INFO    main    Bootstrap   memory.splits-per-node                2        2
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.429Z INFO    dispatcher-query-3  io.trino.event.QueryMonitor TIMELINE: Query 20221216_185626_00041_cuskt :: FAILED (SERVER_STARTING_UP) :: elapsed 0ms :: planning 0ms :: waiting 0ms :: scheduling 0ms :: running 0ms :: finishing 0ms :: begin 2022-12-16T18:56:26.429Z :: end 2022-12-16T18:56:26.429Z
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.482Z INFO    main    io.airlift.bootstrap.LifeCycleManager   Life cycle starting...
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.483Z INFO    main    io.airlift.bootstrap.LifeCycleManager   Life cycle started
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.483Z INFO    main    io.trino.connector.StaticCatalogManager -- Added catalog memory using connector memory --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.483Z INFO    main    io.trino.connector.StaticCatalogManager -- Loading catalog tpch --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.494Z INFO    main    io.trino.connector.StaticCatalogManager -- Added catalog tpch using connector tpch --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.498Z INFO    main    io.trino.security.AccessControlManager  Using system access control: default
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.531Z INFO    main    io.trino.server.Server  ======== SERVER STARTED ========
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:27.274Z ERROR   query-execution-6   io.trino.execution.scheduler.PipelinedQueryScheduler    Failure in distributed stage for query 20221216_185626_00042_cuskt
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: io.trino.spi.TrinoException: No nodes available to run query
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at io.trino.execution.scheduler.UniformNodeSelector.computeAssignments(UniformNodeSelector.java:178)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at io.trino.execution.scheduler.DynamicSplitPlacementPolicy.computeAssignments(DynamicSplitPlacementPolicy.java:41)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at io.trino.execution.scheduler.SourcePartitionedScheduler.schedule(SourcePartitionedScheduler.java:292)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at io.trino.execution.scheduler.SourcePartitionedScheduler$1.schedule(SourcePartitionedScheduler.java:172)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.schedule(PipelinedQueryScheduler.java:1230)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:  at java.base/java.lang.Thread.run(Thread.java:833)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:27.292Z INFO    dispatcher-query-1  io.trino.event.QueryMonitor TIMELINE: Query 20221216_185626_00042_cuskt :: FAILED (NO_NODES_AVAILABLE) :: elapsed 696ms :: planning 329ms :: waiting 273ms :: scheduling 367ms :: running 0ms :: finishing 367ms :: begin 2022-12-16T18:56:26.586Z :: end 2022-12-16T18:56:27.282Z
[main] INFO 🐳 [trinodb/trino:latest] - Container is started (JDBC URL: jdbc:trino://localhost:49864/)
[main] INFO 🐳 [trinodb/trino:latest] - Container trinodb/trino:latest started in PT14.723648S

Additional Information

starting the container suffices. Sometimes it does work out, so it is a timing issue.

eddumelendez commented 1 year ago

JFTR, it starts with version 374

Gingernaut commented 1 year ago

+1 on this issue

eddumelendez commented 1 year ago

At the time, I posted the previous comment tags from 374 to 405 were not working. However, taking a look at this again today seems that they are working fine 🤔. Logs doesn't contain the digest and during my tests didn't store them neither so not sure what happened here.

@hgschmie can you confirm this in your side, please? I've run 20 repeated tests and all green

hgschmie commented 1 year ago

will try.

hgschmie commented 1 year ago

still breaks for me:

[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2023-01-11T19:07:06.962Z      INFO    main    io.trino.server.Server  ======== SERVER STARTED ========
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2023-01-11T19:07:07.995Z      ERROR   query-execution-6       io.trino.execution.scheduler.PipelinedQueryScheduler    Failure in distributed stage for query 20230111_190706_00058_7skfx
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: io.trino.spi.TrinoException: No nodes available to run query
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.UniformNodeSelector.computeAssignments(UniformNodeSelector.java:216)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.DynamicSplitPlacementPolicy.computeAssignments(DynamicSplitPlacementPolicy.java:41)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.SourcePartitionedScheduler.schedule(SourcePartitionedScheduler.java:292)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.SourcePartitionedScheduler$1.schedule(SourcePartitionedScheduler.java:172)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.schedule(PipelinedQueryScheduler.java:1235)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.lang.Thread.run(Thread.java:833)

This seems to be an inherent trino problem. The server is ready but has not yet registered any processing nodes.

It is harder to reproduce in 405 (probably 10 runs until I got the error) but not gone.

szymonm commented 1 year ago

I'm hitting the same issue.

Created the test that fails pretty often on my machine (Intel Mac). Please, have a look here: https://github.com/testcontainers/testcontainers-java/pull/6563/files

I think I understand the issue. The query code we use for checking that container is ready is:

boolean testQuerySucceeded = statement.execute(this.getTestQueryString());
if (testQuerySucceeded) {
  logger().info("Container is started (JDBC URL: {})", this.getJdbcUrl());
  return;
}

Unfortunately, Trino enters some state when it returns from the execute method but it fails with No nodes available to run query when trying to fetch the data from the returned ResultSet. Thus, our check succeeds but Trino is not ready to serve queries that return data. In some other tests I successfully create tables and insert data, but fai later on querying it.

@findepi I see you contributed some code to TrinoContainer and know Trino well. What are your thoughts?

Solutions that come to my mind include:

  1. In TrinoContainer overriding the ready check code to include checking for real data retrieval.
  2. Changing it globally on JDBCContainer level.

WDYT?

findepi commented 1 year ago

@findepi I see you contributed some code to TrinoContainer and know Trino well. What are your thoughts?

thanks @szymonm for the ping. yes, i think i contributed that class

the container defines the readiness check

    public String getTestQueryString() {
        return "SELECT count(*) FROM tpch.tiny.nation";

i think that when I was working on the container, this was sufficient. I think it's not very practical for Trino in a single-node setup to accept a query and fail with No nodes available to run query, so I would treat it as a Trino problem, that we may or may not choose to workaround within testcontainers.

@szymonm @hgschmie if you're able to reproduce the problem using trinodb/trino docker image directly (without testcontainers library), I would recommend filing the bug report with the Trino project.