kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0
2.69k stars 1.35k forks source link

[BUG] spark.eventLog.enable and spark.eventLog.dir not working #2018

Open ahululu opened 1 month ago

ahululu commented 1 month ago

Description

Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration.

If your request is for a new feature, please use the Feature request template.

Reproduction Code [Required]

  hadoopConf:
    # EMRFS filesystem
    fs.s3.customAWSCredentialsProvider: com.amazonaws.auth.WebIdentityTokenCredentialsProvider
    fs.s3.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
    fs.s3a.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
    fs.s3a.endpoint: s3.us-east-1.amazonaws.com
    fs.s3.buffer.dir: /mnt/s3
    fs.s3.getObject.initialSocketTimeoutMilliseconds: "2000"
    mapreduce.fileoutputcommitter.algorithm.version.emr_internal_use_only.EmrFileSystem: "2"
    mapreduce.fileoutputcommitter.cleanup-failures.ignored.emr_internal_use_only.EmrFileSystem: "true"
  sparkConf:
    # Required for EMR Runtime
    spark.driver.extraClassPath: /usr/share/aws/aws-java-sdk-v2/*:/usr/lib/hudi/*:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/*
    spark.driver.extraLibraryPath: /usr/share/aws/aws-java-sdk-v2/*:/usr/lib/hudi/*:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native
    spark.executor.extraClassPath: /usr/share/aws/aws-java-sdk-v2/*:/usr/lib/hudi/*:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/*
    spark.executor.extraLibraryPath: /usr/share/aws/aws-java-sdk-v2/*:/usr/lib/hudi/*:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native
    spark.hadoop.hive.metastore.client.factory.class: com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory
    # History logs
    spark.eventLog.dir: s3://abc/def/
    spark.eventLog.enable: "true"

Steps to reproduce the behavior:

Expected behavior

It will produce a log file at s3://abc/def/

Actual behavior

nothing

Environment & Versions

peter-mcclonski commented 1 month ago

Two items of note: 1- Try using s3a:// rather than s3:// 2- Try adding the following to your sparkConf: spark.eventLog.logBlockUpdates.enabled: "true"

ahululu commented 1 month ago

Two items of note: 1- Try using s3a:// rather than s3:// 2- Try adding the following to your sparkConf: spark.eventLog.logBlockUpdates.enabled: "true"

not working.

ChenYi015 commented 2 days ago

@ahululu Make sure webhook is enabled so hadoop/spark conf can be mounted as configmap. And could you provide detailed error message about event log not working?