apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Bug] Hive SQL "describe extended tableName" does not contain primary key information #2643

Open melin opened 8 months ago

melin commented 8 months ago

Paimon version

0.6

Compute Engine

Hive

Minimal reproduce step

Create the table with Spark SQL:

create table if not exists bigdata.paimon_sample (
    k int,
    v string
) USING paimon
tblproperties (
    'primary-key' = 'k'
)

Run the following Hive SQL; the output does not show the primary-key information:

hive> describe extended paimon_sample;
OK
k                       int                     from deserializer   
v                       string                  from deserializer   

Detailed Table Information      Table(tableName:paimon_sample, dbName:bigdata, owner:melin, createTime:1704431463, lastAccessTime:1704431463, retention:2147483647, sd:StorageDescriptor(cols:[FieldSchema(name:k, type:int, comment:null), FieldSchema(name:v, type:string, comment:null)], location:hdfs://cdh1:8020/user/hive/warehouse/bigdata.db/paimon_sample, inputFormat:org.apache.paimon.hive.mapred.PaimonInputFormat, outputFormat:org.apache.paimon.hive.mapred.PaimonOutputFormat, compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.paimon.hive.PaimonSerDe, parameters:{}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{transient_lastDdlTime=1704431463, totalSize=340, storage_handler=org.apache.paimon.hive.PaimonStorageHandler, numFilesErasureCoded=0, numFiles=1}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, ownerType:USER)
Time taken: 0.858 seconds, Fetched: 4 row(s)
hive> show create table paimon_sample;
OK
CREATE TABLE `paimon_sample`(
  `k` int COMMENT 'from deserializer', 
  `v` string COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'org.apache.paimon.hive.PaimonSerDe' 
STORED BY 
  'org.apache.paimon.hive.PaimonStorageHandler' 

LOCATION
  'hdfs://cdh1:8020/user/hive/warehouse/bigdata.db/paimon_sample'
TBLPROPERTIES (
  'transient_lastDdlTime'='1704431463')
Time taken: 0.183 seconds, Fetched: 12 row(s)
hive> 

What doesn't meet your expectations?

Running describe extended paimon_sample should return information that includes the primary key field.

@YannByron

zhuangchong commented 8 months ago

We need to consider whether Paimon table options should be stored in the Hive table parameters. Currently, when Flink creates a table and Hive is used to view it, these attributes are not visible either.

melin commented 8 months ago

The use case is collecting table metadata in order to obtain the table's primary key information. Hudi can provide this information.
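
To make the metadata-collection use case concrete, here is a minimal sketch of how a collector might look for the primary key in the describe extended output shown earlier in this issue. It is only an illustration under stated assumptions: the helper names (table_parameters, primary_key) are hypothetical, the parsing is a naive regex over the printed Table(...) string rather than a metastore client call, and the parameter key primary-key simply mirrors the Spark tblproperties key from the reproduce step. Whether Paimon should write that key into the Hive parameters is exactly what is under discussion here.

```python
import re
from typing import Dict, List, Optional

# Abbreviated copy of the "Detailed Table Information" line from this issue.
SAMPLE = (
    "Table(tableName:paimon_sample, dbName:bigdata, owner:melin, "
    "sd:StorageDescriptor(serdeInfo:SerDeInfo(name:null, parameters:{}), "
    "parameters:{}, storedAsSubDirectories:false), partitionKeys:[], "
    "parameters:{transient_lastDdlTime=1704431463, totalSize=340, "
    "storage_handler=org.apache.paimon.hive.PaimonStorageHandler, "
    "numFilesErasureCoded=0, numFiles=1}, tableType:MANAGED_TABLE)"
)


def table_parameters(detail: str) -> Dict[str, str]:
    """Return the table-level parameters map from a describe extended line.

    The table-level map is the one printed right after partitionKeys:[...];
    the earlier parameters:{} maps belong to the serde and storage descriptor.
    """
    match = re.search(r"partitionKeys:\[[^\]]*\], parameters:\{([^}]*)\}", detail)
    if match is None:
        return {}
    params: Dict[str, str] = {}
    for pair in match.group(1).split(", "):
        if "=" in pair:
            key, value = pair.split("=", 1)
            params[key.strip()] = value.strip()
    return params


def primary_key(detail: str, key: str = "primary-key") -> Optional[List[str]]:
    """Return the primary key columns if the (assumed) key is present."""
    value = table_parameters(detail).get(key)
    if not value:
        return None
    return [column.strip() for column in value.split(",")]


if __name__ == "__main__":
    # With the output reported in this issue, no primary key can be found.
    print(primary_key(SAMPLE))
```

With the output from this issue the lookup finds nothing, because the table-level parameters hold only Hive statistics and the storage handler. If the Paimon options were synced into the Hive table parameters, the same lookup would return the key columns.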