trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Disable world-writable directory of CREATE TABLE in Hive connector #1379

Closed ebyhr closed 4 years ago

ebyhr commented 5 years ago

Currently, the directory created by the Hive connector's CREATE TABLE is world-writable. This permission has been set explicitly since commit https://github.com/prestosql/presto/commit/0a446bb6f2c64f02282fc9048aa35a382a5c3087.

We already have the hive.hdfs.impersonation.enabled property, so I assume we can replace the permission with a stricter one.

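For comparison, on hdp3.1-hive:24 a table directory created through Hive is rwxr-xr-x, while one created through Presto is rwxrwxrwx: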

0: jdbc:hive2://localhost:10000/default> create table test_hive (c1 int);

[root@hadoop-master /]# hdfs dfs -ls -d hdfs://hadoop-master:9000/user/hive/warehouse/test_hive
drwxr-xr-x   - root supergroup          0 2020-01-29 18:49 hdfs://hadoop-master:9000/user/hive/warehouse/test_hive

presto:default> create table hive.default.test_presto (c1 int);

[root@hadoop-master /]# hdfs dfs -ls -d hdfs://hadoop-master:9000/user/hive/warehouse/test_presto
drwxrwxrwx   - hive supergroup          0 2020-01-29 18:50 hdfs://hadoop-master:9000/user/hive/warehouse/test_presto
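
To make the difference between the two listings explicit, here is a minimal sketch that checks the mode column of an `hdfs dfs -ls` line for the world-writable bit. The class and method names are hypothetical and are not part of Presto or Hadoop; the sketch only inspects the ten-character mode string shown above.

```java
public class WorldWritableCheck {
    /**
     * Returns true if the "others" write bit is set in an ls-style mode
     * string such as "drwxrwxrwx". Layout: 1 type char, then three
     * rwx triplets for owner, group, and others.
     */
    static boolean isWorldWritable(String lsMode) {
        if (lsMode == null || lsMode.length() < 10) {
            throw new IllegalArgumentException("expected a 10-character mode string");
        }
        // Index 0 is the file type, 1-3 owner, 4-6 group, 7-9 others.
        // The write flag of the "others" triplet is therefore at index 8.
        return lsMode.charAt(8) == 'w';
    }

    public static void main(String[] args) {
        // Mode strings copied from the two listings above.
        System.out.println("test_hive:   " + isWorldWritable("drwxr-xr-x"));
        System.out.println("test_presto: " + isWorldWritable("drwxrwxrwx"));
    }
}
```
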
findepi commented 5 years ago

As a Hive connector design rule of thumb, we should check what Hive is doing (how configurable it is and what the default is). I vaguely remember that the answer may depend on the Hive version (I expect Hive 3 to use stricter permissions) and that it is very likely configurable via some umask config.
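
As a rough illustration of how a umask would produce the stricter mode seen with Hive, the sketch below applies a umask to a requested mode and prints the result symbolically. It assumes a umask of 022 (the usual default for Hadoop's `fs.permissions.umask-mode` setting); whether Hive actually derives its table directory permissions this way is not confirmed in this thread. The class and method names are hypothetical.

```java
public class UmaskSketch {
    /** Clears every permission bit that is set in the umask. */
    static int applyUmask(int requestedMode, int umask) {
        return requestedMode & ~umask & 0777;
    }

    /** Renders the low nine permission bits as an rwx string, e.g. 0755 as "rwxr-xr-x". */
    static String toSymbolic(int mode) {
        char[] flags = {'r', 'w', 'x'};
        StringBuilder sb = new StringBuilder(9);
        for (int bit = 8; bit >= 0; bit--) {
            boolean set = (mode & (1 << bit)) != 0;
            sb.append(set ? flags[(8 - bit) % 3] : '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed umask 022: a request for 0777 ends up group- and world-read-only.
        System.out.println(toSymbolic(applyUmask(0777, 0022)));
        // Umask 000: the requested mode is kept, leaving the directory world-writable.
        System.out.println(toSymbolic(applyUmask(0777, 0000)));
    }
}
```

Under these assumptions, the first result matches the mode of the Hive-created directory and the second matches the Presto-created one.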

findepi commented 4 years ago

See https://github.com/prestosql/presto/pull/3126

findepi commented 4 years ago

Covered by https://github.com/prestosql/presto/pull/3126. Please reopen if still relevant.