pulkit-cldcvr opened 1 year ago
I have the exact same issue in my unit tests under Spark 3.4 / Iceberg 1.3. Everything works well except the CALL statements and ALTER TABLE ... ADD|DROP PARTITION FIELD ... commands, even though spark.sql.extensions is correctly set as described by @pulkit-cldcvr.
@pulkit-cldcvr does the issue happen only in pyspark, or in spark-shell as well?
@manuzhang I am experiencing this in pyspark Jupyter notebook using Spark 3.4.1 on EMR Studio workspace.
+1
Quick guess at what might be going wrong: my assumption would be that the session being used is not actually loaded with the extensions. I've seen this happen in a few different instances.
+1
I am facing the same issue in pyspark when creating external tables in Hive using the ICEBERG format:
[PARSE_SYNTAX_ERROR] Syntax error at or near 'ICEBERG'.(line 1, pos 42)
== SQL ==
CREATE EXTERNAL TABLE x (i int) STORED BY ICEBERG;
------------------------------------------^^^
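Apart from the extensions setting, note that STORED BY is Hive DDL; Spark's parser does not accept it even with the Iceberg extensions loaded. In Spark SQL the equivalent table is declared with a USING clause, roughly:

```sql
-- Spark SQL form of the Hive "STORED BY ICEBERG" DDL
-- (table and column names taken from the failing statement above)
CREATE TABLE x (i int) USING iceberg;
```

If the table must be created through Hive itself, the STORED BY syntax belongs in a Hive session with the Iceberg Hive runtime, not in spark.sql().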
In my case, adding spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions solved the problem.
Hi, I face the same problem, but I am running pyspark in Colab. How could I run this command?
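In Colab there is no spark-submit step, so the configs have to go on the session builder before the first session is created. A sketch, assuming pyspark is installed in the notebook and using a local Hadoop catalog; the package coordinates and warehouse path are illustrative and must match your Spark/Iceberg versions:

```python
from pyspark.sql import SparkSession

# spark.jars.packages pulls the Iceberg runtime from Maven at startup.
# "iceberg-spark-runtime-3.4_2.12:1.3.1" is an assumption -- pick the
# artifact matching your Spark and Scala versions.
spark = (
    SparkSession.builder
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.3.1")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.local.type", "hadoop")
    .config("spark.sql.catalog.local.warehouse", "/content/warehouse")
    .getOrCreate()
)
```

If a session already exists in the notebook kernel, stop it (or restart the runtime) first, because getOrCreate() will otherwise return the old session without the extensions.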
Facing the same issue while trying to expire snapshots in Glue version 4.0. I added spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions as well; is there any workaround?
spark.sql("""CALL catalog_name.system.expire_snapshots('db_name.table_name')""")
pyspark.sql.utils.ParseException: Syntax error at or near 'CALL'
Setting "spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" helps.
Yes, I have added all the below config, but still no luck:
conf.set("spark.sql.catalog.job_catalog", "org.apache.iceberg.spark.SparkCatalog")
conf.set("spark.sql.catalog.job_catalog.warehouse", args['iceberg_job_catalog_warehouse'])
conf.set("spark.sql.catalog.job_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
conf.set("spark.sql.catalog.job_catalog.type", "glue")
conf.set("spark.sql.catalog.job_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
conf.set("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
conf.set("spark.hadoop.fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
Query engine
Spark
Question
I am trying to use the Iceberg Glue catalog with Spark. I am able to query the table data, but not able to run procedures.
Exception: pyspark.sql.utils.ParseException: Syntax error at or near 'CALL'
Spark Config:-