apache / incubator-streampark

Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
https://streampark.apache.org/
Apache License 2.0
3.91k stars, 1.01k forks

[Bug] update running job, will set job clusterId to null, leading to Streampark cannot get status for the job #4075

Closed: xieyi888 closed this issue 1 month ago

xieyi888 commented 2 months ago


Java Version

1.8

Scala Version

2.11.x

StreamPark Version

dev

Flink Version

1.16.2

Deploy Mode

yarn-application

What happened

Affected deploy modes: yarn-application and yarn-session.

Below is an example of the error in yarn-application mode.

  1. Start with a running job. Its cluster_id is application_1726758048768_39519.

  2. Update the job and submit it.

     The flink/app/update controller does not pass clusterId when the job is updated, so after the update the job's cluster_id is set to null.

  3. Stop the job.

     The stop operation fails.
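The steps above can be illustrated with a minimal sketch. It is not StreamPark's actual code: the `Job` class and the `naiveUpdate` and `preservingUpdate` methods are hypothetical names used only to show how an update request that lacks clusterId can overwrite the stored value with null, and one possible way to guard against it.

```java
public class UpdateSketch {

    // Hypothetical stand-in for the persisted job record (not StreamPark's real entity).
    static final class Job {
        String jobName;
        String clusterId;

        Job(String jobName, String clusterId) {
            this.jobName = jobName;
            this.clusterId = clusterId;
        }
    }

    // Copies every field from the request, so a request without clusterId
    // overwrites the stored clusterId with null (the behavior reported here).
    static void naiveUpdate(Job stored, Job request) {
        stored.jobName = request.jobName;
        stored.clusterId = request.clusterId;
    }

    // Assumed mitigation: keep the stored clusterId when the request carries none.
    static void preservingUpdate(Job stored, Job request) {
        stored.jobName = request.jobName;
        if (request.clusterId != null) {
            stored.clusterId = request.clusterId;
        }
    }

    public static void main(String[] args) {
        Job running = new Job("demo", "application_1726758048768_39519");
        Job request = new Job("demo-v2", null); // update request without clusterId

        naiveUpdate(running, request);
        System.out.println("after naive update: " + running.clusterId);

        Job running2 = new Job("demo", "application_1726758048768_39519");
        preservingUpdate(running2, request);
        System.out.println("after preserving update: " + running2.clusterId);
    }
}
```

Whether the real fix should live in the controller, the service layer, or the update statement itself is a design choice this sketch does not settle.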

Error stack:

java.util.concurrent.CompletionException: java.lang.reflect.InvocationTargetException
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.streampark.flink.client.FlinkClient$.$anonfun$proxy$1(FlinkClient.scala:89)
    at org.apache.streampark.flink.proxy.FlinkShimsProxy$.$anonfun$proxy$1(FlinkShimsProxy.scala:72)
    at org.apache.streampark.common.util.ClassLoaderUtils$.runAsClassLoader(ClassLoaderUtils.scala:44)
    at org.apache.streampark.flink.proxy.FlinkShimsProxy$.proxy(FlinkShimsProxy.scala:72)
    at org.apache.streampark.flink.client.FlinkClient$.proxy(FlinkClient.scala:84)
    at org.apache.streampark.flink.client.FlinkClient$.cancel(FlinkClient.scala:62)
    at org.apache.streampark.flink.client.FlinkClient.cancel(FlinkClient.scala)
    at org.apache.streampark.console.core.service.application.impl.ApplicationActionServiceImpl.lambda$cancel$0(ApplicationActionServiceImpl.java:314)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    ... 3 more
Caused by: java.lang.IllegalArgumentException: [StreamPark] getClusterClient error. No cluster id was specified. Please specify a cluster to which you would like to connect.
    at org.apache.streampark.common.util.AssertUtils$.required(AssertUtils.scala:59)
    at org.apache.streampark.flink.client.trait.YarnClientTrait.executeClientAction(YarnClientTrait.scala:59)
    at org.apache.streampark.flink.client.trait.YarnClientTrait.doCancel(YarnClientTrait.scala:90)
    at org.apache.streampark.flink.client.trait.YarnClientTrait.doCancel$(YarnClientTrait.scala:86)
    at org.apache.streampark.flink.client.impl.YarnApplicationClient$.doCancel(YarnApplicationClient.scala:47)
    at org.apache.streampark.flink.client.trait.FlinkClientTrait.cancel(FlinkClientTrait.scala:229)
    at org.apache.streampark.flink.client.trait.FlinkClientTrait.cancel$(FlinkClientTrait.scala:212)
    at org.apache.streampark.flink.client.impl.YarnApplicationClient$.cancel(YarnApplicationClient.scala:47)
    at org.apache.streampark.flink.client.FlinkClientEntrypoint$.cancel(FlinkClientEntrypoint.scala:48)
    at org.apache.streampark.flink.client.FlinkClientEntrypoint.cancel(FlinkClientEntrypoint.scala)
    ... 16 more
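The innermost cause in the trace is a precondition check that rejects a missing cluster id before any client is created. The sketch below reproduces that failure mode in isolation. The `required` and `cancel` methods here are simplified, hypothetical stand-ins, not the real `AssertUtils` or `YarnClientTrait` implementations; only the error message is copied from the trace.

```java
public class CancelGuardSketch {

    // Hypothetical precondition helper, modeled loosely on the check seen in the trace.
    static void required(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    // Hypothetical cancel entry point: it refuses to proceed without a cluster id.
    static String cancel(String clusterId) {
        required(
            clusterId != null && !clusterId.isEmpty(),
            "[StreamPark] getClusterClient error. No cluster id was specified. "
                + "Please specify a cluster to which you would like to connect.");
        return "cancel requested for " + clusterId;
    }

    public static void main(String[] args) {
        System.out.println(cancel("application_1726758048768_39519"));
        try {
            cancel(null); // the state left behind by the update in this report
        } catch (IllegalArgumentException e) {
            System.out.println("cancel failed: " + e.getMessage());
        }
    }
}
```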
