
Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Write fails with class cast exception. #3202

Closed. vamosraghava closed this issue 2 weeks ago.

vamosraghava commented 2 years ago

The example that I am using is the one under "Hadoop tables" on https://cloud.google.com/dataproc-metastore/docs/apache-iceberg:

df.write.format("iceberg").mode("overwrite").save("gs://blahblah/iceberg_test/dest/");
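Only the final write call is shown above; the rest of the notebook roughly follows the Hadoop-tables example on that page. A minimal sketch of that setup (placeholder schema; the table location is the same GCS path) looks like this:

import org.apache.iceberg.{PartitionSpec, Schema}
import org.apache.iceberg.hadoop.HadoopTables
import org.apache.iceberg.types.Types

// Placeholder schema; the real columns come from the DataFrame being written.
val location = "gs://blahblah/iceberg_test/dest/"
val schema = new Schema(
  Types.NestedField.required(1, "id", Types.LongType.get()),
  Types.NestedField.optional(2, "data", Types.StringType.get()))

// HadoopTables creates the table directly at the GCS path, without a metastore catalog.
val tables = new HadoopTables(spark.sparkContext.hadoopConfiguration)
tables.create(schema, PartitionSpec.unpartitioned(), location)

// Then overwrite the table contents with the DataFrame, as in the failing call above.
df.write.format("iceberg").mode("overwrite").save(location)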

I get the error shown below. I am running this in a Jupyter notebook with the spylon kernel.

%%init_spark

launcher.packages = "org.apache.spark:spark-avro_2.12:2.4.6,org.apache.spark:spark-sql-kafka-0-10_2.12:2.4.7,org.apache.iceberg:iceberg-spark-runtime:0.12.0"
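In case it helps, the packages can be confirmed from inside the running session (launcher.packages should surface as spark.jars.packages); something like the following, noting that listJars only shows jars added via --jars/--packages, not jars already installed on the cluster classpath:

// Which Maven coordinates the session was launched with.
println(spark.conf.get("spark.jars.packages", "<not set>"))
// Which iceberg jars Spark actually distributed.
spark.sparkContext.listJars().filter(_.contains("iceberg")).foreach(println)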

Exception:

org.apache.spark.SparkException: Writing job failed.
  at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2Exec.scala:87)
  at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:136)
  at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:160)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:157)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
  at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:696)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:80)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
  at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:281)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:249)
  ... 43 elided
Caused by: java.lang.ClassCastException: org.apache.iceberg.spark.source.Writer$TaskCommit cannot be cast to org.apache.iceberg.spark.source.Writer$TaskCommit
  at org.apache.iceberg.spark.source.Writer.lambda$files$3(Writer.java:221)
  at org.apache.iceberg.relocated.com.google.common.collect.Iterators$6.transform(Iterators.java:783)
  at org.apache.iceberg.relocated.com.google.common.collect.TransformedIterator.next(TransformedIterator.java:47)
  at org.apache.iceberg.relocated.com.google.common.collect.TransformedIterator.next(TransformedIterator.java:47)
  at org.apache.iceberg.relocated.com.google.common.collect.Iterators$ConcatenatedIterator.hasNext(Iterators.java:1330)
  at org.apache.iceberg.spark.source.Writer.replacePartitions(Writer.java:190)
  at org.apache.iceberg.spark.source.Writer.commit(Writer.java:145)
  at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2Exec.scala:76)
  ... 57 more
  Suppressed: java.lan

What could be the problem? The cluster runs Spark 2.4.x.

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale'; commenting on the issue is preferred when possible.

github-actions[bot] commented 2 weeks ago

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.