open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0

Sample Data Ingestion: Cannot ingest tables with complex data types #16983

Open nqvuong1998 opened 1 month ago

nqvuong1998 commented 1 month ago

Is your feature request related to a problem? Please describe.
When ingesting sample data from Hive tables through Trino, ingestion fails with the error "Error trying to ingest sample data for table" for tables that contain complex data types.

Describe the solution you'd like
There are two possible solutions:

  1. Display sample data for Hive tables with complex data types such as struct, map, and array in a form that matches the schema structure.
  2. Convert each row's complex values to a JSON string and display that string in a single column.
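
A minimal sketch of the second option, assuming the sample rows arrive in Python as lists of cells in which maps are `dict`s, arrays are `list`s, and structs are `tuple`s. That mapping is an assumption about the driver, and `flatten_complex_value` and `flatten_sample_rows` are hypothetical helper names, not part of the OpenMetadata API:

```python
import json
from datetime import date, datetime
from decimal import Decimal


def _json_default(value):
    """Fallback for nested values that json.dumps cannot encode natively."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, (bytes, bytearray)):
        return value.hex()
    if isinstance(value, Decimal):
        return str(value)
    return str(value)


def flatten_complex_value(value):
    """Return a JSON string for map/array/struct-like values, else the value unchanged."""
    if isinstance(value, (dict, list, tuple)):
        try:
            return json.dumps(value, default=_json_default)
        except (TypeError, ValueError):
            # e.g. a map whose keys are not JSON-compatible; fall back to plain text
            return str(value)
    return value


def flatten_sample_rows(rows):
    """Apply flatten_complex_value to every cell of every sample row."""
    return [[flatten_complex_value(cell) for cell in row] for row in rows]
```

For example, `flatten_sample_rows([[1, {"k": [1, 2]}]])` leaves the integer untouched and turns the map into the string `{"k": [1, 2]}`, so every cell ends up as a scalar that a single display column can hold.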
nqvuong1998 commented 1 month ago

cc @ayush-shah

chuqbach commented 1 month ago

Same issue here. It seems that complex data types are not processed in OpenMetadata: the Hue/Impala connector does not process them at all, while the Trino connector processes them for the schema but not for sample data or the data profiler.