ExpediaGroup / waggle-dance

Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.

Inaccurate schema error when same exists in primary Metastore #134

Closed ambrish29 closed 6 years ago

ambrish29 commented 6 years ago

Assume there is a schema in the primary metastore: pm_mydb

In a different Hive metastore, there is a schema named mydb_another.

When configuring Waggle Dance, we set the prefix for the external metastore to pm_.

When we ran show databases, we got the following result:

hive> show databases;
pm_mydb
pm_mydb_another

After this, when I tried to run a query against pm_mydb, a schema within the primary MS, I got the following error:

hive> use  pm_mydb;

ERROR: FAILED: SemanticException [Error 10072]: Database does not exist: pm_mydb
...

When I changed the prefix to pm2_ for the external MS, everything worked fine.

hive> show databases;
pm_mydb
pm2_mydb_another

hive> use  pm_mydb;
LIST_OF_TABLES

It looks like Waggle Dance is resolving pm_mydb as an external schema (the mydb schema under the MS with the pm_ prefix).

patduin commented 6 years ago

This is expected behaviour. Waggle Dance uses the prefix to redirect to the external metastore. The pm_ prefix is configured to go to the federated metastore, which in turn tells you that pm_mydb (or mydb in the external MS) doesn't exist. This is a weird situation, I suspect because the database was created bypassing Waggle Dance. It is best to pick a prefix that doesn't clash with existing databases (or rename your database).

If this is not an option, there is the alternative of switching to MANUAL database resolution mode (the README has more details). This doesn't use prefixes.
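
The prefix routing described above can be illustrated with a minimal sketch. This is not Waggle Dance's actual code; the function name `resolve`, the metastore labels `"primary"` and `"external"`, and the dictionary-based configuration are all hypothetical, chosen only to show why pm_mydb is sent to the federated metastore as mydb.

```python
from typing import Dict, Tuple


def resolve(database: str, prefixes: Dict[str, str], primary: str = "primary") -> Tuple[str, str]:
    """Return (metastore label, database name as seen by that metastore).

    Hypothetical model of prefix-based resolution: if the requested name
    starts with a configured prefix, strip the prefix and route the request
    to the metastore registered for it; otherwise route to the primary.
    """
    for prefix, metastore in prefixes.items():
        if database.startswith(prefix):
            return metastore, database[len(prefix):]
    return primary, database


# Clashing configuration from this issue: the external metastore uses "pm_".
clashing = {"pm_": "external"}
print(resolve("pm_mydb", clashing))          # routed to the external metastore as "mydb"
print(resolve("pm_mydb_another", clashing))  # routed to the external metastore as "mydb_another"

# After changing the prefix to "pm2_", the primary database is reachable again.
fixed = {"pm2_": "external"}
print(resolve("pm_mydb", fixed))             # stays on the primary metastore
print(resolve("pm2_mydb_another", fixed))    # routed to the external metastore
```

Under this model the primary database pm_mydb is never looked up in the primary metastore while pm_ is a configured prefix, which matches the "Database does not exist" error reported above.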

massdosage commented 6 years ago

Closing this issue for now as it's expected behaviour. If we see users reporting this we could update the documentation to be clear that one shouldn't choose a prefix that matches the prefix of a primary database name.
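
A clash of this kind could be caught before deployment with a simple check. The sketch below is an illustration only, not part of Waggle Dance: the helper name `find_prefix_clashes` is hypothetical, and it assumes you can obtain the list of primary database names yourself (for example from show databases run directly against the primary metastore).

```python
from typing import Iterable, List


def find_prefix_clashes(prefix: str, primary_databases: Iterable[str]) -> List[str]:
    """Return the primary database names that start with the candidate prefix.

    Hive treats database names case-insensitively, so both sides are
    lower-cased before comparing.
    """
    wanted = prefix.lower()
    return sorted(name for name in primary_databases if name.lower().startswith(wanted))


# Hypothetical list of databases in the primary metastore.
primary_databases = ["default", "pm_mydb"]

print(find_prefix_clashes("pm_", primary_databases))   # reports pm_mydb as a clash
print(find_prefix_clashes("pm2_", primary_databases))  # no clash with this prefix
```

An empty result means no primary database would be shadowed by the chosen prefix under the routing behaviour described in this thread.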

ambrish29 commented 6 years ago

> This is expected behaviour. Waggle Dance uses the prefix to redirect to the external metastore. The pm_ prefix is configured to go to the federated metastore, which in turn tells you that pm_mydb (or mydb in the external MS) doesn't exist. This is a weird situation, I suspect because the database was created bypassing Waggle Dance. It is best to pick a prefix that doesn't clash with existing databases (or rename your database).
>
> If this is not an option, there is the alternative of switching to MANUAL database resolution mode (the README has more details). This doesn't use prefixes.

All the databases were created historically, and when Waggle Dance was configured we chose the prefix that had the most suitable meaning.

This caused downtime on one of our clusters, and it took us some time to find the root cause and fix the prefix to bring the system back up.

It would be great to make this explicit in the documentation, as anyone can hit this issue (knowingly or unknowingly).

massdosage commented 6 years ago

@ambrish29 I've had a go at adding this to the wording. It's a bit hard to explain, but let me know if you think this helps: https://github.com/HotelsDotCom/waggle-dance/pull/138 (feel free to add comments on the PR if you're able to; otherwise put them here below).

ambrish29 commented 6 years ago

@massdosage PR looks good to me.