clemlabprojects / ambari

Fork of Apache Ambari maintained by Clemlab Company
https://www.clemlab.com
Apache License 2.0

Hive external table repair partition takes a long time! #65

Closed SGITLOGIN closed 7 months ago

SGITLOGIN commented 7 months ago

@lucasbak Ambari version: 2.7.9.0.0-26, ODP version: 1.2.2.0-105

Hello, my ODP cluster is set up, with Hive data stored on OSS. Repairing the partitions of an external table takes a long time. Why is this? I asked this question before, and you said your tests showed no problems. Have you done any optimization of the Hive parameters?

Time to repair partitions: 271 s

image

The number of table partitions is 31

image

Create table statement

image

lucasbak commented 7 months ago

@SGITLOGIN ,

Could you check the Hive Metastore log while the request is being executed?

Not at all, we have tested with HDFS (supported FS), not OSS.

SGITLOGIN commented 7 months ago

@lucasbak I watched the log. When this line was printed, the partition-repair SQL running in the foreground had completed.

image

lucasbak commented 7 months ago

@SGITLOGIN try to disable stats

SGITLOGIN commented 7 months ago

@lucasbak Could you tell me how to do that? Thank you.

lucasbak commented 7 months ago

Set hive.stats.autogather=false in hive-site.xml
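
For reference, a minimal sketch of how this setting would look as a hive-site.xml property entry. On an Ambari-managed cluster the file is generated by Ambari, so the assumption here is that the value is changed through the Hive configuration in Ambari (followed by a restart of the affected Hive services) rather than by editing the file by hand.

```xml
<!-- Sketch only: disables automatic statistics gathering. -->
<!-- Assumption: applied via Ambari's Hive configuration, not edited directly on disk. -->
<property>
  <name>hive.stats.autogather</name>
  <value>false</value>
</property>
```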

SGITLOGIN commented 7 months ago

@lucasbak Hadoop can be integrated with OSS. So you have not tested the case where the Hive data is on OSS, a Hive external table is created, and msck repair table tablename is run, right?

image

SGITLOGIN commented 7 months ago

@lucasbak I tried it; changing this parameter has no effect:

Set hive.stats.autogather=false in hive-site.xml

lucasbak commented 7 months ago

@SGITLOGIN ,

Indeed, we have tested with Apache HDFS + Apache Hive, without OSS.

SGITLOGIN commented 7 months ago

@lucasbak The solution has been found: set the values of the hive.metastore.transactional.event.listeners and hive.metastore.event.listeners parameters to empty.
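
A minimal sketch of what the two properties might look like in hive-site.xml once emptied. The thread does not show the previous values or the exact procedure used, so the layout below is an assumption; on an Ambari-managed cluster the change would presumably be made in the Hive configuration in Ambari and followed by a Hive Metastore restart.

```xml
<!-- Sketch only: both metastore listener lists set to empty, as described above. -->
<!-- Assumption: the previous values are not shown in the thread; record them before clearing. -->
<property>
  <name>hive.metastore.transactional.event.listeners</name>
  <value></value>
</property>
<property>
  <name>hive.metastore.event.listeners</name>
  <value></value>
</property>
```

With no listeners registered, components that rely on metastore event notifications would likely stop receiving them, so it is worth checking whether anything in the cluster depends on those events before applying this.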