bloomberg / chef-bach

Chef recipes for Bloomberg's deployment of Hadoop and related components
Apache License 2.0

hive-site.xml is missing hive.exec.scratchdir parameter #26

Closed by amithkanand 9 years ago

amithkanand commented 9 years ago

In the current implementation, hive-site.xml is missing the hive.exec.scratchdir parameter. When a query is executed over ODBC/JDBC against a hiveserver2 process, temporary results are stored in the directory specified by hive.exec.scratchdir. Because the parameter is absent from the configuration file, all temporary output is redirected to /tmp/hive-{process owner} on HDFS, and the user executing the query gets a permission denied error. Specifying hive.exec.scratchdir and pointing it to "/tmp" on HDFS will fix the issue.

bijugs commented 9 years ago

We may want to keep the scratch directory user-specific to avoid a data breach: data is stored there while a query runs, and it would be viewable by other users if the directory were not user-specific. Query data can also remain in the scratch directory if the client fails abnormally. We are already creating a scratch directory (drwxrwxrwt - hive supergroup 0 2014-11-13 10:40) in the bcpc-hadoop::hive_metastore recipe. Can we set the scratch directory to /tmp/scratch/hive-${user.name}? It would be good to make /tmp/scratch an attribute, since it will be used in two dependent places.

amithkanand commented 9 years ago

Since the scratch directory is created automatically when a user executes a query, we do not need bcpc-hadoop::hive_metastore to create it. Also, using /tmp/scratch as the base folder can cause issues, as it will be owned by the user who runs the very first Hive query. The correct solution seems to be to define the scratch dir as /tmp/hive-${user.name}: /tmp is writable by every user, and each user's scratch directory will be owned by that user, preventing accidental access to others' temp output. Also, by design, all temp output written to the scratch directory is removed after query execution completes.
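
For reference, a rough sketch of how that setting might look in hive-site.xml. The property name and value are the ones discussed in this thread; the description text is only illustrative, and the actual template in the cookbook may be laid out differently.

```xml
<!-- Sketch only: per-user scratch directory as proposed in this thread -->
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive-${user.name}</value>
  <!-- Illustrative description, not taken from the cookbook -->
  <description>Per-user HDFS scratch space for intermediate query output</description>
</property>
```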

amithkanand commented 9 years ago

Fixed in PR #36