kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0

Please support istio with auto inject #889

Open AceHack opened 4 years ago

AceHack commented 4 years ago

When Istio is set to auto-inject, the executors fail to talk to the driver and jobs never finish. Please add first-class support for Istio.

ringtail commented 4 years ago

@AceHack It's a little different with Istio. Istio doesn't support headless Kubernetes services by default. @liyinan926 Should we expose how to create the driver svc?

AceHack commented 4 years ago

Istio does support headless services as of 1.4 https://github.com/istio/istio/issues/7495

Istio does not work with job like things in Kubernetes https://github.com/istio/istio/issues/6324

The problem is that Spark would need to use something like envoy-preflight to wait for Envoy to come up, and then shut the sidecar down when the Spark application ends.

https://github.com/monzo/envoy-preflight

We are using envoy-preflight for all other jobs in Kubernetes.
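
For context, a minimal sketch of what wrapping the Spark entrypoint with envoy-preflight could look like (the image name and entrypoint path here are assumptions; envoy-preflight reads the sidecar admin endpoint from ENVOY_ADMIN_API, waits for it to be ready, and shuts the sidecar down after the wrapped command exits):

    # Hypothetical driver pod template fragment; assumes envoy-preflight is baked into the image.
    spec:
      containers:
        - name: spark-kubernetes-driver
          image: my-registry/spark-with-preflight:latest   # assumed image
          command: ["envoy-preflight"]                     # waits for the sidecar, then runs the args
          args: ["/opt/entrypoint.sh", "driver"]           # the usual Spark entrypoint, wrapped
          env:
            - name: ENVOY_ADMIN_API                        # Istio's Envoy admin endpoint
              value: "http://127.0.0.1:15000"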

dmoore62 commented 4 years ago

One workaround we tested is configuring the application to run outside the Istio service mesh.

AceHack commented 4 years ago

Running outside the service mesh is not really a valid workaround; we want all the observability a service mesh brings for our Spark jobs.

dmoore62 commented 4 years ago

Working with the upstream community, we have added a configuration workaround for Zookeeper, along with Cassandra, Elasticsearch, Redis, and Apache NiFi. I am sure there are additional applications that are not compatible with sidecars. Please inform the community if you know of any.

https://www.cncf.io/blog/2020/10/26/service-mesh-is-still-hard/

We have yet to really dig into the enhancements in Spark 3, but we may need to accept that we are a part of the "applications that are not compatible with sidecars."

afilipchik commented 3 years ago

Hi, what exactly is incompatible? We added an Istio waiter to the image and the executor starts, but then fails with:

20/11/19 07:55:58 INFO TransportClientFactory: Successfully created connection to spark.spark.svc/10.X.X.X:7078 after 99 ms (0 ms spent in bootstraps)
20/11/19 07:55:58 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from spark.spark.svc/10.X.X.X:7078 is closed
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:281)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    ... 4 more
Caused by: java.io.IOException: Connection from spark.spark.svc/10.X.X.X:7078 closed
    at org.apache.spark.network.client.TransportResponseHandler.channelInactive(TransportResponseHandler.java:146)
    at org.apache.spark.network.server.TransportChannelHandler.channelInactive(TransportChannelHandler.java:108)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
    at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
    at org.apache.spark.network.util.TransportFrameDecoder.channelInactive(TransportFrameDecoder.java:182)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1354)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:917)
    at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:748)

What can it be?

puneetloya commented 3 years ago

It's discussed here: https://github.com/istio/istio/issues/27900. So maybe Istio has a fix upstream but has not released it yet.

bleggett commented 3 years ago

It's discussed here: istio/istio#27900. So maybe Istio has a fix upstream but has not released it yet.

There's nothing for Istio to fix there. Running Spark in vanilla standalone cluster mode through Istio only requires a small amount of Spark master configuration. I've tested Spark and Istio in that fashion and it works fine; workers talk to masters through Istio via headless services.

It seems like there are actually two potential smaller issues here:

  1. Spark workers can't talk to Spark masters with Istio and spark-on-k8s-operator. If this is an issue people are seeing, it's a Spark configuration issue, not a bug on the Istio side; logs would be a big help.

  2. Istio (really the pod-level Envoy proxies) doesn't work with transient workloads like Jobs. This is correct, and there are a variety of workarounds (one common pattern is sketched below). It's unlikely Istio will come up with a fix for this, since Envoy proxies by design do not run to completion and exit with a return code of 0, and workloads being able to terminate their own proxy is a violation of the security model; it's not something I see Istio itself supporting in the long term (though k8s itself might, eventually).
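
A minimal sketch of one such workaround, assuming a recent Istio where the sidecar's pilot-agent exposes /quitquitquit on port 15020 (the container name, image, and entrypoint below are placeholders):

    # Hypothetical Job container that asks the sidecar to exit once the work is done.
    spec:
      containers:
        - name: batch-job                          # placeholder name
          image: my-registry/batch-job:latest      # placeholder image
          command: ["/bin/sh", "-c"]
          args:
            - |
              /opt/run-job.sh                      # placeholder entrypoint
              EXIT_CODE=$?
              curl -fsS -X POST http://127.0.0.1:15020/quitquitquit || true
              exit $EXIT_CODE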

AceHack commented 3 years ago

2 is the problem I'm seeing. Also, what is the master configuration you speak of?

puneetloya commented 3 years ago

As @bleggett said, 2 has a number of workarounds; @AceHack already mentioned envoy-preflight. For 1, in Spark apps created by the Spark operator, and in general for Spark on k8s, the driver listens on the pod IP. This part is hardcoded in the Spark code:

https://github.com/apache/spark/blob/3a299aa6480ac22501512cd0310d31a441d7dfdc/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/DriverServiceFeatureStep.scala#L34

Istio has depended on the primary container listening on 127.0.0.1. istio/istio#27900 is a fix for that; with it, Istio should be able to work when the driver pod listens on the pod IP.

The Spark master and workers are a different deployment model. In a way they are independent of being deployed on Kubernetes or Mesos, as they do not use the platform to create workers/executors.

bleggett commented 3 years ago

@AceHack @puneetloya For vanilla Spark, all you have to do is set the SPARK_MASTER_HOST environment variable to 127.0.0.1 - no Istio changes are required.

Example values.yaml I pass to the Bitnami Spark chart:

master:
  extraEnvVars:
    - name: SPARK_MASTER_HOST
      value: "127.0.0.1"

That's all you need for vanilla Spark and Istio to play nice. istio/istio#27900 isn't strictly necessary for issue 1 above, since Spark supports binding to local addresses just fine out of the box with the right config.

haf commented 3 years ago

@bleggett I think this advice is outdated as of Istio 1.10, which changes the default interface Envoy forwards inbound packets to from lo to eth0.

Also on this topic, can we make this operator name the ports according to their protocols?

$ istioctl analyze -n flows
Info [IST0118] (Service sparkpi-py-0ae54679adebbb65-driver-svc.flows) Port name blockmanager (port: 7079, targetPort: 7079) doesn't follow the naming convention of Istio port.
Info [IST0118] (Service sparkpi-py-0ae54679adebbb65-driver-svc.flows) Port name driver-rpc-port (port: 7078, targetPort: 7078) doesn't follow the naming convention of Istio port.
Info [IST0118] (Service sparkpi-py-0ae54679adebbb65-driver-svc.flows) Port name spark-ui (port: 4040, targetPort: 4040) doesn't follow the naming convention of Istio port.
Info [IST0118] (Service sparkpi-py-ui-svc.flows) Port name spark-driver-ui-port (port: 4040, targetPort: 4040) doesn't follow the naming convention of Istio port.
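
For reference, Istio's convention is name: <protocol>[-<suffix>], so a driver service that satisfies IST0118 would need port names roughly like the following (the exact suffixes are just illustrative):

    ports:
      - name: tcp-driver-rpc    # was driver-rpc-port
        port: 7078
        targetPort: 7078
      - name: tcp-blockmanager  # was blockmanager
        port: 7079
        targetPort: 7079
      - name: http-spark-ui     # was spark-ui
        port: 4040
        targetPort: 4040
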
jkleckner commented 3 years ago

@haf Does #1239 address this?

puneetloya commented 3 years ago

Info [IST0118] (Service sparkpi-py-0ae54679adebbb65-driver-svc.flows) Port name driver-rpc-port (port: 7078, targetPort: 7078) doesn't follow the naming convention of Istio port.

AFAIK, the naming of the ports is not controlled by the Spark operator. It's a code change required in the Apache Spark distribution (this was valid at least up to Spark 2.4; I am not sure whether it changed in Spark 3). The headless service is created by the driver once it is up, to make itself discoverable to executors.

haf commented 3 years ago

@jkleckner Yes, but it would probably make sense to change the default name, too.

MrTrustor commented 3 years ago

Asking here, since this seems to be the relevant thread. I'm currently trying to make the Spark operator work with the following stack:

  1. I have enabled Istio auto-injection on the namespace where I create my SparkApplication (I'm using the spark-pi example)
  2. I have customised the Docker image I'm using to include scuttle
  3. I have tried to set the mTLS policy of Istio to DISABLE and PERMISSIVE via a PeerAuthentication resource (roughly the shape sketched below)
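
For reference, the PeerAuthentication in item 3 is a namespace-wide policy of roughly this shape (resource and namespace names are placeholders):

    apiVersion: security.istio.io/v1beta1
    kind: PeerAuthentication
    metadata:
      name: spark-mtls          # placeholder name
      namespace: spark          # placeholder namespace
    spec:
      mtls:
        mode: PERMISSIVE        # DISABLE was also tried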

Whatever I do, the executor fails with the following error:

21/08/24 16:43:47 INFO TransportClientFactory: Successfully created connection to spark-pi-bb55307b78fb84d6-driver-svc.spark.svc/192.168.5.156:7078 after 151 ms (0 ms spent in bootstraps)
21/08/24 16:43:47 ERROR TransportClient: Failed to send RPC RPC 7365963296669017718 to spark-pi-bb55307b78fb84d6-driver-svc.spark.svc/192.168.5.156:7078: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
    at io.netty.channel.AbstractChannel$AbstractUnsafe.newClosedChannelException(AbstractChannel.java:957)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:865)
...

I'm not sure what I'm doing wrong: from various comments on this thread and others, this should work, no? The Istio issue with the primary container listening on the pod IP rather than 127.0.0.1 has been fixed in the versions I'm using.

If anyone has an idea, I'd be very thankful!

meggers commented 2 years ago

Hi @MrTrustor, did you resolve your issue? I am running into the same thing, though with a slightly older Istio, so I am setting SPARK_MASTER_HOST.

MrTrustor commented 2 years ago

No, I haven't. We had to run Spark without istio in the end.

meggers commented 2 years ago

Failed to send RPC RPC xx to xx: java.nio.channels.ClosedChannelException

FWIW, I was able to get executor->driver communication to work by overriding the env var SPARK_DRIVER_BIND_ADDRESS to 127.0.0.1 on the driver. The default is hardcoded to the pod IP and was not working with our version of Istio.
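
In case it helps others, one way this override could be expressed through the operator is via the driver's env in the SparkApplication spec (a sketch only; whether it takes precedence over the value Spark injects may depend on the Spark and operator versions):

    apiVersion: sparkoperator.k8s.io/v1beta2
    kind: SparkApplication
    metadata:
      name: spark-pi            # placeholder name
    spec:
      driver:
        env:
          - name: SPARK_DRIVER_BIND_ADDRESS
            value: "127.0.0.1"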

Owen-CH-Leung commented 1 year ago

We were facing the same error as well (org.apache.spark.shuffle.FetchFailedException: Failed to send RPC RPC 5700455101843038265 to /xxx: java.nio.channels.ClosedChannelException), and we were able to get communication between the driver and executor pods up and running by doing the following:

  1. Set up the block manager port by adding the Spark conf spark.blockManager.port: "7080" (see the sketch below)
  2. Create the following headless service:

    apiVersion: v1
    kind: Service
    metadata:
      name: spark-executor
    spec:
      clusterIP: None
      ports:
        - name: blockmanager
          protocol: TCP
          port: 7080
      selector:
        spark-role: executor

Note that it doesn't have to be port 7080; it can be any port number. By default, a random port is assigned.
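
For step 1, a sketch of how the conf can be set on the SparkApplication itself (assuming the sparkConf field of the v1beta2 CRD; the port just has to match the headless service above):

    spec:
      sparkConf:
        spark.blockManager.port: "7080"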

Credits to @dayyeung

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.