apache / incubator-streampark

Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
https://streampark.apache.org/
Apache License 2.0
3.82k stars, 983 forks

[Bug] Streampark Application state is not right with yarn application #3722

Closed zyfbbd closed 1 month ago

zyfbbd commented 2 months ago

Java Version

jdk8

Scala Version

2.12.x

StreamPark Version

2.0.0

Flink Version

1.14.4

Deploy Mode

yarn-application

What happened

The StreamPark platform shows the application's run status as failed, but the YARN application is still running.

Error Exception

java.util.concurrent.CompletionException: java.lang.reflect.InvocationTargetException
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.streampark.flink.client.FlinkClient$.$anonfun$cancel$1(FlinkClient.scala:73)
    at org.apache.streampark.flink.proxy.FlinkShimsProxy$.$anonfun$proxy$1(FlinkShimsProxy.scala:63)
    at org.apache.streampark.common.util.ClassLoaderUtils$.runAsClassLoader(ClassLoaderUtils.scala:40)
    at org.apache.streampark.flink.proxy.FlinkShimsProxy$.proxy(FlinkShimsProxy.scala:63)
    at org.apache.streampark.flink.client.FlinkClient$.cancel(FlinkClient.scala:68)
    at org.apache.streampark.flink.client.FlinkClient.cancel(FlinkClient.scala)
    at org.apache.streampark.console.core.service.impl.ApplicationServiceImpl.lambda$cancel$6(ApplicationServiceImpl.java:1235)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    ... 3 more
Caused by: org.apache.flink.util.FlinkException: [StreamPark] Triggering a savepoint for the job 1391271bf9f3b0234c4abc67b29233e6 failed. detail: java.util.concurrent.TimeoutException
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
    at org.apache.streampark.flink.client.trait.FlinkSubmitTrait.cancelJob(FlinkSubmitTrait.scala:482)
    at org.apache.streampark.flink.client.trait.FlinkSubmitTrait.cancelJob$(FlinkSubmitTrait.scala:451)
    at org.apache.streampark.flink.client.impl.YarnApplicationSubmit$.org$apache$streampark$flink$client$trait$YarnSubmitTrait$$super$cancelJob(YarnApplicationSubmit.scala:46)
    at org.apache.streampark.flink.client.trait.YarnSubmitTrait.$anonfun$doCancel$1(YarnSubmitTrait.scala:53)
    at scala.util.Try$.apply(Try.scala:209)
    at org.apache.streampark.flink.client.trait.YarnSubmitTrait.doCancel(YarnSubmitTrait.scala:52)
    at org.apache.streampark.flink.client.trait.YarnSubmitTrait.doCancel$(YarnSubmitTrait.scala:39)
    at org.apache.streampark.flink.client.impl.YarnApplicationSubmit$.doCancel(YarnApplicationSubmit.scala:46)
    at org.apache.streampark.flink.client.trait.FlinkSubmitTrait.cancel(FlinkSubmitTrait.scala:159)
    at org.apache.streampark.flink.client.trait.FlinkSubmitTrait.cancel$(FlinkSubmitTrait.scala:143)
    at org.apache.streampark.flink.client.impl.YarnApplicationSubmit$.cancel(YarnApplicationSubmit.scala:46)
    at org.apache.streampark.flink.client.FlinkClientHandler$.cancel(FlinkClientHandler.scala:43)
    at org.apache.streampark.flink.client.FlinkClientHandler.cancel(FlinkClientHandler.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.streampark.flink.client.FlinkClient$.$anonfun$cancel$1(FlinkClient.scala:73)
    at org.apache.streampark.flink.proxy.FlinkShimsProxy$.$anonfun$proxy$1(FlinkShimsProxy.scala:63)
    at org.apache.streampark.common.util.ClassLoaderUtils$.runAsClassLoader(ClassLoaderUtils.scala:40)
    at org.apache.streampark.flink.proxy.FlinkShimsProxy$.proxy(FlinkShimsProxy.scala:63)
    at org.apache.streampark.flink.client.FlinkClient$.cancel(FlinkClient.scala:68)
    at org.apache.streampark.flink.client.FlinkClient.cancel(FlinkClient.scala)
    at org.apache.streampark.console.core.service.impl.ApplicationServiceImpl.lambda$cancel$6(ApplicationServiceImpl.java:1235)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    at org.apache.streampark.flink.client.trait.YarnSubmitTrait$$anonfun$doCancel$2.applyOrElse(YarnSubmitTrait.scala:57)
    at org.apache.streampark.flink.client.trait.YarnSubmitTrait$$anonfun$doCancel$2.applyOrElse(YarnSubmitTrait.scala:55)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:34)
    at scala.util.Failure.recover(Try.scala:230)
    at org.apache.streampark.flink.client.trait.YarnSubmitTrait.doCancel(YarnSubmitTrait.scala:55)
    at org.apache.streampark.flink.client.trait.YarnSubmitTrait.doCancel$(YarnSubmitTrait.scala:39)
    at org.apache.streampark.flink.client.impl.YarnApplicationSubmit$.doCancel(YarnApplicationSubmit.scala:46)
    at org.apache.streampark.flink.client.trait.FlinkSubmitTrait.cancel(FlinkSubmitTrait.scala:159)
    at org.apache.streampark.flink.client.trait.FlinkSubmitTrait.cancel$(FlinkSubmitTrait.scala:143)
    at org.apache.streampark.flink.client.impl.YarnApplicationSubmit$.cancel(YarnApplicationSubmit.scala:46)
    at org.apache.streampark.flink.client.FlinkClientHandler$.cancel(FlinkClientHandler.scala:43)
    at org.apache.streampark.flink.client.FlinkClientHandler.cancel(FlinkClientHandler.scala)
    ... 15 more
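
Reading the trace from the bottom up: the innermost cause is a `java.util.concurrent.TimeoutException` thrown by a timed `CompletableFuture.get` while the cancel path waited for a savepoint. A timed `get` that times out only stops the caller from waiting; it does not stop the work behind the future. The sketch below illustrates that behavior in plain Java. It is not StreamPark's code: the class `SavepointTimeoutSketch`, the method `cancelWithSavepoint`, and the use of `IllegalStateException` as the wrapper are all hypothetical stand-ins, and whether this is what leaves the platform status out of sync with YARN is an assumption based on the symptom reported above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SavepointTimeoutSketch {

    // Hypothetical stand-in for a "trigger savepoint, then cancel" step.
    public static String cancelWithSavepoint(CompletableFuture<String> savepoint, long timeoutMillis) {
        try {
            // Timed wait: throws TimeoutException if the future is not done in time.
            return savepoint.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The caller gave up waiting; the future itself is left untouched.
            throw new IllegalStateException("Triggering a savepoint failed", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for savepoint", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Savepoint failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // A future that never completes models a savepoint that takes too long.
        CompletableFuture<String> neverCompletes = new CompletableFuture<>();
        try {
            cancelWithSavepoint(neverCompletes, 100);
        } catch (IllegalStateException e) {
            System.out.println("cancel failed: " + e.getCause().getClass().getSimpleName());
            System.out.println("work still pending: " + !neverCompletes.isDone());
        }
    }
}
```

Running `main` prints `cancel failed: TimeoutException` followed by `work still pending: true`. A caller that treats this exception as "the job has stopped" would therefore record a state that the cluster does not share.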

wolfboys commented 1 month ago

Version 2.1.4 has been released, and this bug is already fixed there. Please upgrade to 2.1.4 and try it out.