prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

AbstractTestNativeTpchQueries TPCH queries retries exhausted #19480

Open karteekmurthys opened 1 year ago

karteekmurthys commented 1 year ago

Some of the e2e tests are failing with the following error:

com.facebook.presto.nativeworker.AbstractTestNativeTpchQueries.testTpchQ18(AbstractTestNativeTpchQueries.java:172)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:135)
    at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:673)
    at org.testng.internal.invokers.TestInvoker.invokeTestMethod(TestInvoker.java:220)
    at org.testng.internal.invokers.MethodRunner.runInSequence(MethodRunner.java:50)
    at org.testng.internal.invokers.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:945)
    at org.testng.internal.invokers.TestInvoker.invokeTestMethods(TestInvoker.java:193)
    at org.testng.internal.invokers.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146)
    at org.testng.internal.invokers.TestMethodWorker.run(TestMethodWorker.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.RuntimeException: Failed to fetched data from 127.0.0.1:1235 /v1/task/20230425_211022_00008_x52vw.6.0.3/results/1/0 - Exhausted retries: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)

There are 5 such failures in this run: https://github.com/prestodb/presto/actions/runs/4801103339/jobs/8542879139?pr=19432

mbasmanova commented 1 year ago

CC: @aditi-pandit @tanjialiang

mbasmanova commented 1 year ago

CC: @majetideepak

isadikov commented 1 year ago

I could also reproduce this when running TPC-DS queries in a Prestissimo integration test. Even after changing the worker ports, the queries work fine for a while but soon start failing again.
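
For anyone trying to narrow this down: errno 111 (Connection refused) generally means nothing was listening on the target port when the connect was attempted, so it may help to check whether the worker at 127.0.0.1:1235 is still accepting connections at the moment the failures start. Below is a minimal sketch of such a probe using only the JDK. The class and method names (`WorkerPortProbe`, `isAccepting`, `waitUntilAccepting`) are hypothetical and not part of Presto or its test framework, and the attempt count and delays are arbitrary assumptions.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper; not part of the Presto code base.
public class WorkerPortProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    // A refused connection (errno 111 on Linux) surfaces as an IOException here.
    static boolean isAccepting(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Probes the port up to `attempts` times, sleeping delayMillis after each failed attempt.
    static boolean waitUntilAccepting(String host, int port, int attempts, long delayMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            if (isAccepting(host, port, 1000)) {
                return true;
            }
            Thread.sleep(delayMillis);
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Defaults mirror the address in the error message above; override via arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1235;
        boolean up = waitUntilAccepting(host, port, 5, 200);
        System.out.println(host + ":" + port + " accepting=" + up);
    }
}
```

If the probe reports the port as closed while a query is still running, that would suggest the worker process exited or stopped listening, rather than a transient network problem. This only checks TCP reachability and says nothing about why the worker went away.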