GoogleCloudPlatform / flink-on-k8s-operator

[DEPRECATED] Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.
Apache License 2.0

Flink Session Cluster REST handler exception #465

Closed acesir closed 2 years ago

acesir commented 2 years ago

We are testing a session cluster using the flink:1.13.0 image together with the flink-operator:latest image, and we run into an issue when submitting jobs. The following error appears in the job manager log:

22:21:03.748 [Flink-DispatcherRestEndpoint-thread-1] ERROR org.apache.flink.runtime.webmonitor.handlers.JarRunHandler - Exception occurred in REST handler: Could not execute application.

On the UI itself, this error shows in the notification section:

Could not execute application.
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$1(
    at java.util.concurrent.CompletableFuture.uniHandle(
    at java.util.concurrent.CompletableFuture$UniHandle.tryFire(
    at java.util.concurrent.CompletableFuture.postComplete(
    at java.util.concurrent.CompletableFuture$
    at java.util.concurrent.Executors$
    at
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(
    at java.util.concurrent.ScheduledThreadPoolExecutor$
    at java.util.concurrent.ThreadPoolExecutor.runWorker(
    at java.util.concurrent.ThreadPoolExecutor$
    at
Caused by: java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Could not execute application.
    at java.util.concurrent.CompletableFuture.encodeThrowable(
    at java.util.concurrent.CompletableFuture.completeThrowable(
    at java.util.concurrent.CompletableFuture$
    ... 7 more
Caused by: org.apache.flink.util.FlinkRuntimeException: Could not execute application.
    at org.apache.flink.client.deployment.application.DetachedApplicationRunner.tryExecuteJobs(
    at
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$0(
    at java.util.concurrent.CompletableFuture$
    ... 7 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Job was submitted in detached mode. Results of job execution, such as accumulators, runtime, etc. are not available. Please make sure your program doesn't call an eager execution function [collect, print, printToErr, count].
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(
    at org.apache.flink.client.ClientUtils.executeProgram(
    at org.apache.flink.client.deployment.application.DetachedApplicationRunner.tryExecuteJobs(
    ... 10 more
Caused by: org.apache.flink.api.common.InvalidProgramException: Job was submitted in detached mode. Results of job execution, such as accumulators, runtime, etc. are not available. Please make sure your program doesn't call an eager execution function [collect, print, printToErr, count].
    at org.apache.flink.core.execution.DetachedJobExecutionResult.getAccumulatorResult(
    at
    at
    at
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(
    ... 13 more

What is strange is that the job itself completes fine, but the error messages on the UI are confusing. Does anyone know whether the flink-operator:latest image is compatible with a session cluster running the flink:1.13.0 image? We are using the latest chart, just with a different session cluster image. Any help is appreciated.

acesir commented 2 years ago

It seems that running a job in detached mode from the UI requires an --output argument. Adding it with a path on a valid volume resolves the REST exception.
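For anyone submitting through the REST API rather than the UI, here is a minimal sketch of the same fix. It assumes the job accepts an --output argument and writes to that path instead of calling an eager function such as print(). The /jars/:jarid/run endpoint and the programArgsList field come from Flink's REST API. The host, jar id, and output path are hypothetical placeholders. The request is only built here, not sent.

```python
import json
from urllib import request


def build_run_request(base_url, jar_id, output_path):
    """Build (but do not send) a POST to Flink's /jars/<jar_id>/run endpoint.

    The job is given an --output argument so that it writes its results to
    a file sink rather than printing them, which fails in detached mode.
    """
    body = {"programArgsList": ["--output", output_path]}
    return request.Request(
        url=f"{base_url.rstrip('/')}/jars/{jar_id}/run",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: replace with your JobManager address, the id returned
# when the jar was uploaded, and a path on a volume mounted in the cluster.
req = build_run_request(
    "http://localhost:8081",
    "example-id_WordCount.jar",
    "/data/wordcount-out",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen(req) would actually submit the job. The output path has to be writable from the pods that run the job, which is why a valid volume is needed.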