Closed arunb2w closed 1 year ago
This is because GlueCatalog applies additional validations to the table name: it should contain only lowercase letters, digits, and underscores. https://github.com/apache/iceberg/blob/6d2edd6284ebc5301dbe45376a31ca8316852a77/aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java#L499-L506
You can try setting glue.skip-name-validation via catalog properties if you want to skip these validations:
https://github.com/apache/iceberg/blob/6d2edd6284ebc5301dbe45376a31ca8316852a77/aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java#L106-L114
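If you would rather check a name before calling Glue, the validation mentioned above can be approximated locally. This is only a minimal sketch, not the library's code: the pattern (lowercase letters, digits, and underscores, 1 to 255 characters) is an assumption based on my reading of the linked GlueCatalog source, and `is_valid_glue_table_name` is a hypothetical helper name.

```python
import re

# ASSUMPTION: approximates the Glue table-name rule as lowercase letters,
# digits, and underscores, 1 to 255 characters. Check the linked source
# for the authoritative pattern.
GLUE_TABLE_NAME = re.compile(r"[a-z0-9_]{1,255}")


def is_valid_glue_table_name(name: str) -> bool:
    """Return True if the whole name matches the assumed Glue pattern."""
    return GLUE_TABLE_NAME.fullmatch(name) is not None


# Hypothetical table names, for illustration only.
for candidate in ("my_table", "MyTable", "my-table"):
    print(candidate, is_valid_glue_table_name(candidate))
```

A name that fails this check is a likely candidate for the error described in this issue, unless name validation is skipped.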
> You can try setting glue.skip-name-validation via catalog properties if you want to skip these validations:
It is very hard to figure out how to set these properties. Could you please share a small example? I have tried spark.glue.skip-name-validation
or spark.sql.glue.skip-name-validation
or spark.sql.catalog.my_catalog.glue.skip-name-validation
and have had no luck :-(
Ideally,
--conf spark.sql.catalog.{catalog_name}.glue.skip-name-validation=true
should have worked. Can you please add the complete Spark confs you are passing, and also the Iceberg version you are trying it with?
Note: this was added in iceberg 0.14.0 release
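Since a small example was requested, here is a minimal sketch of how the catalog-level properties fit together. It only builds the `--conf` arguments and does not start Spark. The catalog name `my_catalog` is taken from the question above; the warehouse path `s3://my-bucket/warehouse` and the helper names `glue_catalog_conf` and `to_submit_args` are hypothetical. The property is set to `true` here on the assumption that the goal is to skip the validation.

```python
def glue_catalog_conf(catalog_name, warehouse, skip_name_validation=True):
    """Return Spark conf key/value pairs for an Iceberg catalog backed by Glue."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
        # Hypothetical warehouse location; replace with your own bucket.
        f"{prefix}.warehouse": warehouse,
        # Catalog properties are passed under the catalog's own prefix.
        f"{prefix}.glue.skip-name-validation": str(skip_name_validation).lower(),
    }


def to_submit_args(conf):
    """Flatten a conf dict into spark-submit style '--conf key=value' arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


conf = glue_catalog_conf("my_catalog", "s3://my-bucket/warehouse")
print(" ".join(to_submit_args(conf)))
```

The same key/value pairs could instead be applied inside the job with `SparkSession.builder.config(key, value)` before the session is created. Either way, the important part is that the property sits under `spark.sql.catalog.<catalog_name>.`, not directly under `spark.` or `spark.sql.`.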
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'
Apache Iceberg version
0.14.0
Query engine
EMR
Please describe the bug 🐞
I am facing an error when creating an Iceberg table on EMR using the Glue catalog. Spark version: 3.2.1, Iceberg version: 0.14.0.
Sample code:
Spark command used to run:
spark-submit --deploy-mode cluster --packages org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:0.14.0,software.amazon.awssdk:bundle:2.17.257,software.amazon.awssdk:url-connection-client:2.17.257 --conf spark.yarn.submit.waitAppCompletion=true --conf "spark.executor.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=\"/opt/spark\"" --conf spark.dynamicAllocation.enabled=true --conf spark.executor.maxMemory=32g --conf spark.dynamicAllocation.executorIdleTimeout=300 --conf spark.shuffle.service.enabled=true --driver-memory 8g --num-executors 1 --executor-memory 32g --executor-cores 5 iceberg_main.py
Error stacktrace:
Please provide insights on what I am missing. The same code works fine if I use the Hadoop catalog instead of Glue.