Describe the bug
Metadata ingestion fails for Iceberg tables that are partitioned by a nested column (a field inside a STRUCT).
To Reproduce
Metadata ingestion works for this table, which is partitioned by the top-level column b:
CREATE TABLE catalog1.db1.table1 (a STRUCT<b: STRING>, b STRING) PARTITIONED BY (b)
Metadata ingestion fails for this table, which is partitioned by the nested field a.b:
CREATE TABLE catalog1.db1.table1 (a STRUCT<b: STRING>, b STRING) PARTITIONED BY (a.b)
Error:
[2024-10-31T13:58:37.779+0000] {status.py:91} WARNING - Failed to ingest CreateTableRequest [table1] due to api request failure: Invalid column name found in table partition
[2024-10-31T13:58:37.779+0000] {status.py:92} DEBUG - Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 243, in _one_request
    resp.raise_for_status()
  File "/home/airflow/.local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://openmetadata-server:8585/api/v1/tables

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/sink/metadata_rest.py", line 146, in _run
    return self._run_dispatch(record)
  File "/usr/local/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/sink/metadata_rest.py", line 137, in _run_dispatch
    return self.write_create_request(record)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/sink/metadata_rest.py", line 167, in write_create_request
    created = self.metadata.create_or_update(entity_request)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/ometa_api.py", line 280, in create_or_update
    return self._create(data=data, method="put")
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/ometa_api.py", line 271, in _create
    resp = fn(self.get_suffix(entity), data=data.model_dump_json())
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/utils/execution_time_tracker.py", line 195, in inner
    result = func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 324, in put
    return self._request("PUT", path, data)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 212, in _request
    return self._one_request(method, url, opts, retry)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 263, in _one_request
    raise APIError(error, http_error) from http_error
metadata.ingestion.ometa.client.APIError: Invalid column name found in table partition
Expected behavior
Metadata ingestion works for tables that are partitioned by a nested column.
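The error message suggests, though this is only a guess and not confirmed from the server code, that the partition column name is checked against top-level column names only, so the dotted path a.b is never found. The sketch below illustrates that guess with a hypothetical, simplified `Column` model (not the OpenMetadata schema) and two hypothetical helpers: `find_top_level` mimics the suspected check, and `find_nested` shows the kind of lookup that would accept a nested partition column.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    """Hypothetical, simplified column model; not the OpenMetadata schema."""
    name: str
    children: List["Column"] = field(default_factory=list)


def find_top_level(columns: List[Column], name: str) -> Optional[Column]:
    """Suspected current behavior: match against top-level names only."""
    return next((c for c in columns if c.name == name), None)


def find_nested(columns: List[Column], path: str) -> Optional[Column]:
    """Resolve a dotted path such as "a.b" by descending into struct children."""
    current: Optional[Column] = None
    candidates = columns
    for part in path.split("."):
        current = next((c for c in candidates if c.name == part), None)
        if current is None:
            return None
        candidates = current.children
    return current


# Columns of the table from the reproduction: a STRUCT<b: STRING>, b STRING
columns = [Column("a", children=[Column("b")]), Column("b")]

print(find_top_level(columns, "b"))    # top-level b is found
print(find_top_level(columns, "a.b"))  # nested path is not found
print(find_nested(columns, "a.b"))     # nested path resolves to a's child b
```

With a top-level-only lookup, PARTITIONED BY (b) passes and PARTITIONED BY (a.b) is rejected, which matches the behavior reported here; a path-aware lookup accepts both.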
Affected module
Ingestion Framework
Version: