ryu3065 opened this issue 1 year ago
Could you try using SparkSessionCatalog instead of SparkCatalog in your config?
"spark.sql.catalog.spark_catalog": "org.apache.iceberg.spark.SparkSessionCatalog"
@akshayakp97 Hi! Although I have set SparkSessionCatalog, it is still not working for my Parquet table, but Iceberg tables work well.
spark.conf.get("spark.sql.catalog.spark_catalog")
res1: String = org.apache.iceberg.spark.SparkSessionCatalog
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
We are running into the same problem. @ryu3065, have you managed to find a solution? We would like to run a MERGE query that reads from Parquet tables and writes to an Iceberg table, but we haven't managed to make it work yet.
Hello,
I am using org.apache.iceberg.spark.SparkSessionCatalog instead of SparkCatalog. I am able to create both Iceberg and non-Iceberg tables on the Glue catalog. However, when I try to execute SHOW TABLE EXTENDED db_name LIKE (table_name) on an Iceberg table, it throws this error: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table. StorageDescriptor#InputFormat cannot be null for table
I am using dbt Spark to load Iceberg tables on the Glue catalog. I don't have control over the SQL command which dbt generates to check table existence. Because of the error I am getting, I am unable to proceed further.
Any help on this issue would be appreciated.
Query engine
Spark (using spark-defaults)

Question
The Iceberg table works normally with the configurations I set, but the Parquet table that was used previously does not work, failing with the error message below.
Iceberg table
spark.sql("select * from db.iceberg_table ").show()
It works well. But:
Parquet table
spark.sql("select * from db.parquet_table ").show()
This raises an error: org.apache.iceberg.exceptions.ValidationException: Input Glue table is not an iceberg table: spark_catalog.db.parquet_table (type=null)
So, I tried setting the Spark conf spark.conf.set("spark.sql.catalog.spark_catalog.type", "hive") (also with glue, hadoop, and other values), but got the same error.
How should I set the Spark conf so that both Parquet and Iceberg tables work through the Glue catalog without a catalog name prefix (i.e., via spark_catalog)?
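A sketch of the kind of Glue-backed spark-defaults configuration being attempted here, with property names taken from the Iceberg AWS/Glue docs. This assumes the iceberg-spark-runtime and iceberg-aws bundle jars are on the classpath; the warehouse path is a placeholder, and these properties must be set before the SparkSession is created (setting them at runtime via spark.conf.set generally has no effect on catalog initialization):

```properties
spark.sql.extensions                          org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
# SparkSessionCatalog falls back to the built-in catalog for non-Iceberg tables
spark.sql.catalog.spark_catalog               org.apache.iceberg.spark.SparkSessionCatalog
spark.sql.catalog.spark_catalog.catalog-impl  org.apache.iceberg.aws.glue.GlueCatalog
spark.sql.catalog.spark_catalog.io-impl       org.apache.iceberg.aws.s3.S3FileIO
# Placeholder bucket; replace with your actual warehouse location
spark.sql.catalog.spark_catalog.warehouse     s3://your-bucket/warehouse
```

With catalog-impl pointed at GlueCatalog, Iceberg tables resolve through Glue metadata, while SparkSessionCatalog should delegate plain Parquet tables to Spark's built-in session catalog instead of rejecting them with the ValidationException above.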