stackabletech / docker-images

Spark 4.0.0-preview1 findings #772

Open maltesander opened 1 month ago

maltesander commented 1 month ago

I tried to include the Spark 4.0.0-preview1 image in the 24.7 release.

The build worked fine with these settings:

    {
        "product": "4.0.0-preview1",
        "java-base": "17",
        "java-devel": "17",
        "python": "3.11",
        "hadoop_long_version": "3.4.0",  # https://github.com/apache/spark/blob/7a7a8bc4bab591ac8b98b2630b38c57adf619b82/pom.xml#L125
        # NOTE: The "aws_java_sdk_bundle" jar now is only called bundle-x.x.x instead of aws-java-sdk-bundle-x.x.x and was renamed before uploading to nexus
        "aws_java_sdk_bundle": "2.23.19",  # https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws/3.4.0
        "azure_storage": "7.0.1",  # https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-azure/3.4.0
        "azure_keyvault_core": "1.0.0",  # https://mvnrepository.com/artifact/com.microsoft.azure/azure-storage/7.0.1
        "jackson_dataformat_xml": "2.17.1",  # https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.13/4.0.0-preview1
        "stax2_api": "4.2.2",  # https://mvnrepository.com/artifact/com.fasterxml.jackson.dataformat/jackson-dataformat-xml/2.17.1
        "woodstox_core": "6.6.2",  # https://mvnrepository.com/artifact/com.fasterxml.jackson.dataformat/jackson-dataformat-xml/2.17.1
        "vector": "0.39.0",
        "jmx_exporter": "1.0.1",
        "tini": "0.19.0",
    },
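For future reference, the renaming described in the NOTE comment above can be sketched as follows. This is only an illustration: the helper name `bundle_jar_names` is hypothetical, and the issue does not say how the rename was actually done before the upload to Nexus.

```python
def bundle_jar_names(version: str) -> tuple[str, str]:
    """Return (upstream_name, renamed_name) for the AWS SDK bundle jar.

    Assumption: the upstream jar is published as bundle-<version>.jar and
    the image build expects the older aws-java-sdk-bundle-<version>.jar name.
    """
    upstream = f"bundle-{version}.jar"
    renamed = f"aws-java-sdk-bundle-{version}.jar"
    return upstream, renamed


upstream, renamed = bundle_jar_names("2.23.19")
print(f"{upstream} -> {renamed}")
```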

Some integration tests failed, so we decided not to include it in the 24.7 release.

        --- FAIL: kuttl/harness/iceberg_openshift-false_spark-4.0.0-preview1 (305.58s) -> iceberg jar missing (not released yet?)
        --- FAIL: kuttl/harness/logging_openshift-false_spark-4.0.0-preview1_ny-tlc-report-0.2.0 (894.33s)
        --- FAIL: kuttl/harness/spark-history-server_openshift-false_spark-4.0.0-preview1_s3-use-tls-true (1029.86s)
        --- FAIL: kuttl/harness/spark-ny-public-s3_openshift-false_spark-4.0.0-preview1_s3-use-tls-true (60.13s)
        --- FAIL: kuttl/harness/spark-ny-public-s3_openshift-false_spark-4.0.0-preview1_s3-use-tls-false (59.66s)

The Iceberg failure, for example, came from the test looking for an Iceberg jar that simply does not exist yet:

    :: org.apache.iceberg#iceberg-spark-runtime-4.0_2.12;1.4.0: not found
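To check later whether the jar has been released, the Ivy-style coordinate from the log line above can be turned into a repository URL. This is a minimal sketch that assumes the standard Maven repository layout and uses Maven Central as the base; the helper name `ivy_to_maven_url` is hypothetical, and the tests may resolve against a different repository.

```python
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"  # assumed repository base


def ivy_to_maven_url(coordinate: str, base: str = MAVEN_CENTRAL) -> str:
    """Convert 'org#module;version' into the jar URL in a Maven-layout repo."""
    org, rest = coordinate.split("#", 1)
    module, version = rest.split(";", 1)
    # Maven layout: dots in the group id become path separators.
    group_path = org.replace(".", "/")
    return f"{base}/{group_path}/{module}/{version}/{module}-{version}.jar"


url = ivy_to_maven_url("org.apache.iceberg#iceberg-spark-runtime-4.0_2.12;1.4.0")
print(url)
```

Requesting the printed URL (for example with a HEAD request) and getting a 404 would be consistent with the "not found" message.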

This is just for future reference and nothing to act on for now.