cloudera-labs / hive-sre

Apache License 2.0

Validate Hive location path includes nameservice name #12

Open randygoering opened 2 years ago

randygoering commented 2 years ago

Check that the Hive table location path is valid in HDFS.

We had a cluster that at one time had HDFS HA enabled, and tables were defined with a location that included the nameservice name. When HDFS HA was disabled, the Hive table locations were not updated to reflect the correct HDFS path. HSMM failed during migration because some of the Hive tables had incorrect HDFS locations defined.

It appears that the hive-sre tool did not report that these tables had invalid HDFS locations.

randygoering commented 1 year ago

Citi had issues with locations missing hdfs://nameservice-example/data/. The SDS table in the metastore had a location for the table, so it was not flagged as missing, but the location was incorrect and caused a failure in Hive metadata schema validation. The location should look like this: hdfs://nameservice-example/data/... Support Case 911529

mszurap commented 2 months ago

For the missing nameservice ID, besides validating the SDS entries (table and partition locations), the DBS table also needs to be validated:

SELECT * FROM SDS WHERE LOCATION NOT LIKE 'hdfs://%' AND LOCATION NOT LIKE 'har://%';

and

SELECT DB_ID, NAME, DB_LOCATION_URI FROM DBS WHERE DB_LOCATION_URI NOT LIKE 'hdfs://%';

See also case 1057442. Could you help with this, @dstreev?
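To illustrate the kind of check being asked for, here is a minimal Python sketch that goes one step beyond the scheme filters above and also compares the URI authority against the nameservice you expect. It is not part of hive-sre. The names `classify_location`, `find_invalid`, and `expected_nameservice` are hypothetical, and the rows are assumed to have already been fetched from SDS (`SD_ID`, `LOCATION`) or DBS (`DB_ID`, `DB_LOCATION_URI`) as `(id, location)` pairs.

```python
from urllib.parse import urlparse


def classify_location(location, expected_nameservice, allowed_schemes=("hdfs", "har")):
    """Return None if the location looks valid, otherwise a short reason string.

    Hypothetical helper: `expected_nameservice` is whatever authority the
    cluster should be using today (for example "nameservice-example").
    """
    parsed = urlparse(location or "")
    # Mirrors the NOT LIKE 'hdfs://%' / 'har://%' filters in the queries.
    if parsed.scheme not in allowed_schemes:
        return "missing or unexpected scheme"
    # Extra check: an hdfs:// URI must point at the expected nameservice.
    # har:// authorities wrap another filesystem, so they are not compared here.
    if parsed.scheme == "hdfs" and parsed.netloc != expected_nameservice:
        return "unexpected authority"
    return None


def find_invalid(rows, expected_nameservice):
    """Yield (id, location, reason) for every row that fails the check.

    `rows` is assumed to be an iterable of (id, location) pairs.
    """
    for row_id, location in rows:
        reason = classify_location(location, expected_nameservice)
        if reason is not None:
            yield row_id, location, reason


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        (1, "hdfs://nameservice-example/data/t1"),
        (2, "/data/t2"),
        (3, "hdfs://old-nameservice/data/t3"),
    ]
    for hit in find_invalid(sample, "nameservice-example"):
        print(hit)
```

Unlike the SQL filters, this also catches a location that has an hdfs:// prefix but still names a nameservice that no longer exists, which is the situation described at the top of this issue.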