brayanjuls / diane

Hive helper functions for Apache Spark users
MIT License

Create command to show all Hive tables #2

Open brayanjuls opened 1 year ago

brayanjuls commented 1 year ago

It'd be useful to have a Hive command that provides a more elegant way to show all the Hive tables / associated metadata than what's outlined in this answer.

Perhaps HiveHelpers.allTables() that returns a list of HiveTable objects. The HiveTable object can contain the table name, HiveTableType, etc. I'm just making up abstractions that could be cool. Feel free to implement this however it's best abstracted.

This issue was migrated from https://github.com/MrPowers/jodie/issues/20
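
To make the proposed abstraction a bit more concrete, here is a minimal sketch of what a list of `HiveTable` objects could look like. It is written in plain Python so it runs without Spark, and it is not the library's API: the field names, the `all_tables` helper, the `catalog_rows` argument, and the sample rows are all hypothetical. The three table types mirror the kinds Spark's catalog distinguishes (managed, external, view).

```python
from dataclasses import dataclass
from enum import Enum


class HiveTableType(Enum):
    """Hypothetical enum for the table kinds a catalog can report."""
    MANAGED = "MANAGED"
    EXTERNAL = "EXTERNAL"
    VIEW = "VIEW"


@dataclass(frozen=True)
class HiveTable:
    """Hypothetical record for one table; field names are illustrative."""
    database: str
    name: str
    table_type: HiveTableType
    partition_columns: tuple = ()


def all_tables(catalog_rows):
    """Convert raw metadata rows (plain dicts here) into HiveTable objects.

    A real implementation would read these rows from the metastore;
    this sketch takes them as an argument so it runs without Spark.
    """
    return [
        HiveTable(
            database=row["database"],
            name=row["tableName"],
            table_type=HiveTableType(row["type"]),
            partition_columns=tuple(row.get("partitionColumns", ())),
        )
        for row in catalog_rows
    ]


# Illustrative sample rows, not real metastore output.
sample_rows = [
    {"database": "default", "tableName": "orders", "type": "MANAGED"},
    {"database": "default", "tableName": "events", "type": "EXTERNAL",
     "partitionColumns": ["day"]},
]

tables = all_tables(sample_rows)

# Typed objects make filtering by table kind straightforward.
external = [t.name for t in tables if t.table_type is HiveTableType.EXTERNAL]
print(external)
```

Modeling the table type as an enum rather than a free-form string means an unexpected value fails loudly at construction time instead of slipping through a later string comparison.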

brayanjuls commented 1 year ago

@MrPowers - I am making progress on this issue. Calling HiveHelpers.allTables() already generates the output below. Do you have any additional columns in mind?

+--------+--------------+--------+------------+----------------+-------------+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
|database|tableName     |provider|owner       |partitionColumns|bucketColumns|type    |detail                                                                                                                                                        |
+--------+--------------+--------+------------+----------------+-------------+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
|default |e_new_table   |delta   |brayan_jules|[]              |[]           |EXTERNAL|{tableProperties -> [delta.minReaderVersion=1,delta.minWriterVersion=2]}                                                                                      |
|default |lang_num_table|parquet |brayan_jules|[num]           |[]           |MANAGED |{inputFormat -> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat, outputFormat -> org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat}|
|default |num_table     |delta   |brayan_jules|[]              |[]           |MANAGED |{tableProperties -> [delta.minReaderVersion=1,delta.minWriterVersion=2]}                                                                                      |
|default |p_e_new_table |parquet |brayan_jules|[num]           |[]           |EXTERNAL|{inputFormat -> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat, outputFormat -> org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat}|
+--------+--------------+--------+------------+----------------+-------------+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+

MrPowers commented 1 year ago

Wow, that is a beautiful function!! I don't have any additional columns in mind, I'm just so excited about this functionality!