kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.

Executor cannot communicate with Thrift server #1283

Open jstomphorst opened 3 years ago

jstomphorst commented 3 years ago

Hi guys,

I am starting a Thrift server, and that Thrift server starts an executor. But the executor tries to communicate with the Thrift server on unexpected ports. Below are my errors and config.

Thanks!

My Kubernetes config:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-thrift-server
  #  namespace: spark-operator
  labels:
    foo: bar
spec:
  replicas: 1
  selector:
    matchLabels:
      foo: bar
  template:
    metadata:
      labels:
        foo: bar
    spec:
      containers:
      - name: spark-thrift-server
        image: gcr.io/spark-operator/spark:v3.0.0
        args:
          - /opt/spark/bin/spark-submit
          - --master
          - k8s://https://kubernetes.default.svc.itr-dsp-ot-k8s.privatehybridcloud.eu:443
          - --class
          - org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
          - --deploy-mode
          - client
          - --name
          - spark-thrift
          - --hiveconf
          - hive.server2.thrift.port=10000
          - --conf
          - spark.executor.instances=1
          - --conf
          - spark.executor.memory=512M
          - --conf
          - spark.driver.memory=512M
          - --conf
          - spark.executor.cores=1
          - --conf
          - spark.kubernetes.namespace=default
          - --conf
          - spark.kubernetes.container.image=gcr.io/spark-operator/spark:v3.0.0
          - --conf
          - spark.kubernetes.authenticate.driver.serviceAccountName=spark
          - --conf
          - spark.kubernetes.driver.pod.name=$(THRIFT_POD_NAME)
          - --conf
          - spark.driver.bindAddress=$(THRIFT_POD_IP)
          - --conf
          - spark.driver.host=spark-thrift-server
        env:
        - name: THRIFT_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: THRIFT_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        ports:
        - containerPort: 4040
          name: spark-ui
          protocol: TCP
        - containerPort: 10000
          name: spark-thrift
          protocol: TCP
      serviceAccount: spark
      serviceAccountName: spark
---
apiVersion: v1
kind: Service
metadata:
  name: spark-thrift-server
  #namespace: spark-operator
spec:
  ports:
  - name: spark-ui
    port: 4040
    protocol: TCP
    targetPort: 4040
  - name: spark-thrift
    port: 10000
    protocol: TCP
    targetPort: 10000
  - name: spark-thrift1
    port: 44477
    protocol: TCP
    targetPort: 10000
  selector:
    foo: bar
  sessionAffinity: None
  type: LoadBalancer
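For reference (not stated in the thread): in client mode the executors connect back to the driver not only on the ports listed above but also on spark.driver.port and spark.blockManager.port, which Spark picks at random unless pinned, so a Service exposing only 4040 and 10000 cannot route that traffic. A minimal sketch of pinning them, with illustrative port numbers 7078/7079, would add to the args:

          - --conf
          - spark.driver.port=7078          # fixed driver RPC port for executor connections
          - --conf
          - spark.blockManager.port=7079    # fixed block manager port for block/shuffle traffic

and the matching entries to the Service ports:

  - name: spark-driver
    port: 7078
    protocol: TCP
    targetPort: 7078
  - name: spark-blockmanager
    port: 7079
    protocol: TCP
    targetPort: 7079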
jdonnelly-apixio commented 3 years ago

@jstomphorst Not positive, but I think you might need spark.hadoop.hive.server2.thrift.port set to 10000 as well. My working args section:

          - /opt/spark/bin/spark-submit
          - --master
          - k8s://https://$(KUBERNETES_SERVICE_HOST):443
          - --class
          - org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
          - --deploy-mode
          - client
          - --name
          - spark-sql
          - --hiveconf
          - hive.server2.thrift.port=10000
          - --conf
          - spark.executor.instances=1
          - --conf
          - spark.executor.memory=2G
          - --conf
          - spark.driver.memory=2G
          - --conf
          - spark.executor.cores=2
          - --conf
          - spark.kubernetes.namespace=spark-operator
          - --conf
          - spark.kubernetes.container.image=xxx/spark-app:v1.0.3
          - --conf
          - spark.kubernetes.authenticate.driver.serviceAccountName=spark-operator
          - --conf
          - spark.kubernetes.driver.pod.name=$(THRIFT_POD_NAME)
          - --conf
          - spark.driver.bindAddress=$(THRIFT_POD_IP)
          - --conf
          - spark.hadoop.hive.metastore.client.connect.retry.delay=5
          - --conf
          - spark.hadoop.hive.metastore.client.socket.timeout=1800
          - --conf
          - spark.hadoop.hive.metastore.uris=thrift://my-metastore:9083
          - --conf
          - spark.hadoop.hive.server2.enable.doAs=false
          - --conf
          - spark.hadoop.hive.server2.thrift.port=10000
          - --conf
          - spark.hadoop.hive.server2.transport.mode=binary
          - --conf
          - spark.hadoop.metastore.catalog.default=spark
          - --conf
          - spark.hadoop.hive.execution.engine=spark
          - --conf
          - spark.hadoop.hive.input.format=io.delta.hive.HiveInputFormat
          - --conf
          - spark.hadoop.hive.tez.input.format=io.delta.hive.HiveInputFormat
          - --conf
          - spark.sql.warehouse.dir=s3a://xxx
          - --conf
          - spark.hadoop.fs.defaultFS=s3a://xxx
          - --conf
          - spark.hadoop.fs.s3a.connection.ssl.enabled=true
          - --conf
          - spark.hadoop.fs.s3a.endpoint=https://s3.us-west-2.amazonaws.com
          - --conf
          - spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
          - --conf
          - spark.hadoop.fs.s3a.fast.upload=true
          - --conf
          - spark.hadoop.fs.s3a.path.style.access=true
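A quick way to check connectivity once the server is up (assuming the spark-thrift-server Service and port 10000 from the earlier manifests, default namespace) is to open a JDBC session with the beeline client that ships in the Spark image:

kubectl run beeline-test --rm -it \
  --image=gcr.io/spark-operator/spark:v3.0.0 -- \
  /opt/spark/bin/beeline -u jdbc:hive2://spark-thrift-server.default.svc:10000 -e 'show databases;'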
jstomphorst commented 3 years ago

I've found the problem.

My original Service:

apiVersion: v1
kind: Service
metadata:
  name: spark-thrift-server
  #namespace: spark-operator
spec:
  ports:
  ...

The Thrift server needs a direct connection to the driver pod, which means a headless Service:

apiVersion: v1
kind: Service
metadata:
  name: spark-thrift-server
spec:
  clusterIP: None
  ports:
  ...

So my solution is two Services: the headless one above for the executors, and a regular one for clients (see the sketch below).
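For completeness, a sketch of that two-Service setup, reusing the names, labels, and ports from the manifests earlier in the thread (the client-facing Service name spark-thrift-server-lb is hypothetical):

apiVersion: v1
kind: Service
metadata:
  # Must match spark.driver.host; headless, so executors resolve the driver pod IP
  # directly and can reach it on any port, including the random driver RPC port.
  name: spark-thrift-server
spec:
  clusterIP: None
  selector:
    foo: bar
  ports:
  - name: spark-thrift
    port: 10000
    protocol: TCP
    targetPort: 10000
---
apiVersion: v1
kind: Service
metadata:
  name: spark-thrift-server-lb   # hypothetical name for the client-facing Service
spec:
  type: LoadBalancer
  selector:
    foo: bar
  ports:
  - name: spark-thrift
    port: 10000
    protocol: TCP
    targetPort: 10000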

github-actions[bot] commented 6 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.