Open malopezh opened 2 months ago
thanks for reporting this. can you add an example code of how you created the table?
Hello!
Sure, here is the code:
```python
schema = Schema(
    NestedField(field_id=1, name="datetime", field_type=StringType(), required=False, current_schema=1),
    NestedField(field_id=2, name="symbol", field_type=StringType(), required=False, current_schema=1),
    NestedField(field_id=3, name="bid", field_type=FloatType(), required=False, current_schema=1),
    NestedField(field_id=4, name="ask", field_type=DoubleType(), required=False, current_schema=1),
)

partition_spec = PartitionSpec(
    PartitionField(source_id=1, field_id=1000, transform=DayTransform(), name="datetime_day")
)

from pyiceberg.table.sorting import SortOrder, SortField
from pyiceberg.transforms import IdentityTransform

sort_order = SortOrder(SortField(source_id=2, transform=IdentityTransform()))

identifier = ("iceberg", "default")

tbl = local_catalog.create_table_if_not_exists(
    identifier=identifier,
    schema=schema,
    location="s3a://my_oci_bucket/my_folder",
    partition_spec=partition_spec,
    sort_order=sort_order,
    properties={},
)
tbl.overwrite(df)
```
NOTE: metadata is being created successfully
Thanks!!
line 3457, in recv_create_table raise result.o3 hive_metastore.ttypes.MetaException: MetaException(message='java.lang.IllegalArgumentException: bucket is null/empty')
This error is not from pyiceberg, but possibly from your underlying (hadoop) fs https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java#L1102
catalog_type= "hadoop",
This is also not a valid catalog type https://github.com/apache/iceberg-python/blob/40357476ad0a79f7486b96a6e29b404bc699b70d/pyiceberg/catalog/__init__.py#L177-L183
"bucket_name": "s3a://my_bucket",
`bucket_name` is not a valid parameter for catalog https://py.iceberg.apache.org/configuration/#catalogs
uri="thrift://localhost:9083",
Is this a HMS? I think the error is from the HMS setup
Yes, it's HMS. I configured Hadoop, Hive, and the Hive Metastore service, and I also configured MySQL. I was able to create a new namespace with `local_catalog.create_namespace("myNS")`, but it's obvious that I missed something.
My intention is creating Iceberg Tables in OCI Object Storage. Is there any documentation I can check to achieve this?
Thanks for your responses.
My intention is creating Iceberg Tables in OCI Object Storage. Is there any documentation I can check to achieve this?
I don't know of any OCI-related documentation. However, here's a guide on setting up a catalog and writing to it: https://py.iceberg.apache.org/#connecting-to-a-catalog
I suggest getting that working and then replacing the catalog with your own.
Since you are using HMS, you should be using the Hive Catalog https://py.iceberg.apache.org/configuration/#hive-catalog or similarly
load_catalog(..., catalog_type= "hive")
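The suggestion above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names `hive_catalog_properties` and `load_hive_catalog` are hypothetical, and the property key is written as `type`, which is how the PyIceberg configuration page linked above spells it (the `catalog_type` keyword in this thread may be shorthand). The URI and warehouse values are the ones from the original report.

```python
from typing import Dict


def hive_catalog_properties(uri: str, warehouse: str) -> Dict[str, str]:
    """Build catalog properties for a Hive Metastore-backed catalog (hypothetical helper)."""
    return {
        # Assumption: the catalog type is read from the "type" key.
        "type": "hive",
        # Thrift endpoint of the Hive Metastore.
        "uri": uri,
        # Root location for table data and metadata.
        "warehouse": warehouse,
    }


def load_hive_catalog(name: str, uri: str, warehouse: str):
    """Load the catalog; requires PyIceberg and a reachable metastore."""
    # Imported lazily so the properties helper works without PyIceberg installed.
    from pyiceberg.catalog import load_catalog

    return load_catalog(name, **hive_catalog_properties(uri, warehouse))


# Values taken from the report; no connection is made here.
props = hive_catalog_properties("thrift://localhost:9083", "s3a://my_bucket")
print(props)
```

Note that this only covers the client side. The `MetaException` in the report is raised by the metastore itself, so the HMS/Hadoop S3A configuration would likely still need to be checked separately.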
Apache Iceberg version
0.7.1 (latest release)
Please describe the bug 🐞
Problem: Trying to create a table in OCI Object Storage. Metadata is successfully created but data is not.
Expected: The full Iceberg structure (metadata and data) to be created, but only metadata is being created.
StackTrace:
Traceback (most recent call last):
  File "/home/marcolo/development/reorgParquets/.venv/lib/python3.10/site-packages/pyiceberg/catalog/__init__.py", line 418, in create_table_if_not_exists
    return self.create_table(identifier, schema, location, partition_spec, sort_order, properties)
  File "/home/marcolo/development/reorgParquets/.venv/lib/python3.10/site-packages/pyiceberg/catalog/hive.py", line 376, in create_table
    self._create_hive_table(open_client, tbl)
  File "/home/marcolo/development/reorgParquets/.venv/lib/python3.10/site-packages/pyiceberg/catalog/hive.py", line 325, in _create_hive_table
    open_client.create_table(hive_table)
  File "/home/marcolo/development/reorgParquets/.venv/lib/python3.10/site-packages/hive_metastore/ThriftHiveMetastore.py", line 3431, in create_table
    self.recv_create_table()
  File "/home/marcolo/development/reorgParquets/.venv/lib/python3.10/site-packages/hive_metastore/ThriftHiveMetastore.py", line 3457, in recv_create_table
    raise result.o3
hive_metastore.ttypes.MetaException: MetaException(message='java.lang.IllegalArgumentException: bucket is null/empty')
Iceberg Catalog:
`local_catalog = load_catalog(name='s3', uri="thrift://localhost:9083", warehouse= "s3a://my_bucket", catalog_type= "hadoop",