apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
https://kyuubi.apache.org/
Apache License 2.0

[Bug] Query failed with NoSuchFieldError: LZ4_RAW #6050

Closed: awol2005ex closed this issue 7 months ago

awol2005ex commented 7 months ago

Describe the bug

Running MERGE INTO on an Iceberg table throws an exception:

MERGE INTO test1.dwd_sl_car_order_iceberg1 a
USING test1.dwd_sl_car_order  b
ON a.order_id = b.order_id 
WHEN MATCHED THEN
  UPDATE SET
    a.order_id = b.order_id,
    .......
    a.dt = b.dt
WHEN NOT MATCHED
  THEN INSERT (
    order_id,...., dt
  )
  VALUES (
    b.order_id,..... b.dt
  )

Error executing query:

SQL Error: org.apache.kyuubi.KyuubiSQLException: org.apache.kyuubi.KyuubiSQLException: Error operating ExecuteStatement: java.lang.NoSuchFieldError: LZ4_RAW
    at org.apache.spark.sql.execution.datasources.parquet.ParquetOptions$.<init>(ParquetOptions.scala:104)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetOptions$.<clinit>(ParquetOptions.scala)
    at org.apache.spark.sql.hive.HiveMetastoreCatalog.convert(HiveMetastoreCatalog.scala:137)
    at org.apache.spark.sql.hive.RelationConversions$$anonfun$apply$4.applyOrElse(HiveStrategies.scala:239)
    at org.apache.spark.sql.hive.RelationConversions$$anonfun$apply$4.applyOrElse(HiveStrategies.scala:226)

Affects Version(s)

Kyuubi 1.8.0, Spark 3.5.0

Kyuubi Server Log Output

No response

Kyuubi Engine Log Output

Error executing query:
SQL Error: org.apache.kyuubi.KyuubiSQLException: org.apache.kyuubi.KyuubiSQLException: Error operating ExecuteStatement: java.lang.NoSuchFieldError: LZ4_RAW
    at org.apache.spark.sql.execution.datasources.parquet.ParquetOptions$.<init>(ParquetOptions.scala:104)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetOptions$.<clinit>(ParquetOptions.scala)
    at org.apache.spark.sql.hive.HiveMetastoreCatalog.convert(HiveMetastoreCatalog.scala:137)
    at org.apache.spark.sql.hive.RelationConversions$$anonfun$apply$4.applyOrElse(HiveStrategies.scala:239)
    at org.apache.spark.sql.hive.RelationConversions$$anonfun$apply$4.applyOrElse(HiveStrategies.scala:226)

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

kyuubi.engine.yarn.cores=3
kyuubi.engine.yarn.memory=8092
spark.executor.memory=8g
hadoop.security.authentication=KERBEROS
kyuubi.kinit.principal=hive@XXX.COM
kyuubi.kinit.keytab=/opt/hive.keytab
#spark.yarn.principal=hive@XXX.COM
spark.yarn.keytab=/opt/hive.keytab
spark.kerberos.principal=hive@XXX.COM
spark.kerberos.keytab=/opt/hive.keytab
hive.metastore.kerberos.principal=hive/_HOST@XXX.COM
hive.metastore.kerberos.keytab=/opt/hive.keytab
spark.master=yarn
spark.yarn.queue=default
kyuubi.zookeeper.embedded.client.port=23181
spark.yarn.appMasterEnv.PYSPARK_PYTHON=/usr/bin/python3.6-spark
spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON=/usr/bin/python3.6-spark
spark.jars.packages=org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:1.4.3,org.apache.doris:spark-doris-connector-3.2_2.12:1.3.0,com.ibm.db2:jcc:11.5.9.0,com.mysql:mysql-connector-j:8.0.33,com.oracle.database.jdbc:ojdbc8:23.3.0.23.09,com.microsoft.sqlserver:mssql-jdbc:12.4.2.jre8,com.vertica.jdbc:vertica-jdbc:23.4.0-0,org.apache.kudu:kudu-client:1.17.0,org.apache.kudu:kudu-spark3_2.12:1.17.0,org.apache.kyuubi:kyuubi-extension-spark-jdbc-dialect_2.12:1.8.0,org.elasticsearch:elasticsearch-spark-30_2.12:8.6.1
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.apache.spark.sql.dialect.KyuubiSparkJdbcDialectExtension
spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
spark.executor.extraJavaOptions=-Djava.security.krb5.conf=/opt/krb5.conf
spark.files=/opt/krb5.conf
spark.redaction.string.regex=\\s*pass\\s*[\\s|\\S]*$
spark.executor.instances=10

Additional context

No response

github-actions[bot] commented 7 months ago

Hello @awol2005ex, Thanks for finding the time to report the issue! We really appreciate the community's efforts to improve Apache Kyuubi.

pan3793 commented 7 months ago

It's a Spark side bug. https://issues.apache.org/jira/browse/SPARK-45484

awol2005ex commented 7 months ago

I compiled Spark 3.5.0 myself. Running this SQL with spark-submit + PySpark works fine, but running it through Kyuubi throws the same exception.

pan3793 commented 7 months ago

If you changed something, you cannot call it 3.5.0. That misleads the volunteers who are helping you dig into the issue, and it wastes time on both sides.

https://github.com/apache/kyuubi/discussions/2481

Kyuubi just calls the Spark runtime to complete the query. You may need to provide reproducible steps from scratch for further investigation.
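
One way to narrow this down when the same SQL works under spark-submit but fails through Kyuubi is to compare which jars on each classpath provide the class that holds the missing field. The sketch below is only an illustration, not something from this thread. It assumes the missing field is the `LZ4_RAW` constant of Parquet's `org.apache.parquet.hadoop.metadata.CompressionCodecName` enum, and the helper name `scan_jars` is hypothetical. It will not see copies of the class that a runtime jar has relocated under a shaded package name.

```python
import zipfile
from pathlib import Path

# Assumption: the field named in the NoSuchFieldError belongs to this Parquet enum.
CLASS_ENTRY = "org/apache/parquet/hadoop/metadata/CompressionCodecName.class"
FIELD_NAME = b"LZ4_RAW"


def scan_jars(jar_dir):
    """Map each jar under jar_dir that bundles CLASS_ENTRY to True or False.

    True means the class bytes mention FIELD_NAME. Enum constant names are
    stored as plain strings in the class file's constant pool, so a byte
    search is enough for a quick check.
    """
    root = Path(jar_dir)
    results = {}
    for jar in sorted(root.rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if CLASS_ENTRY in zf.namelist():
                    key = str(jar.relative_to(root))
                    results[key] = FIELD_NAME in zf.read(CLASS_ENTRY)
        except zipfile.BadZipFile:
            continue  # skip files that are not valid archives
    return results
```

Running `scan_jars` against the Spark `jars` directory used by spark-submit and against the one the Kyuubi engine is launched with (plus any directory where the `spark.jars.packages` artifacts are downloaded) shows whether a jar that lacks the constant is present in only one of the two environments. A `False` entry marks a jar whose copy of the class predates the field.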

awol2005ex commented 7 months ago

There is no pre-built Spark 3.5.0 package compatible with CDH7, so I had to compile it myself, and I fixed many bugs for it.