nexr / RHive

RHive is an R extension facilitating distributed computing via Apache Hive.
http://nexr.github.io/RHive

Cannot modify RHIVE_UDF_DIR at runtime #92

Closed shashivish closed 8 years ago

shashivish commented 8 years ago

Hi

I have installed RHive on the HDP sandbox and am trying to run some queries on Hive, but I get the error below.

library(RHive)
rhive.init()
rhive.connect()

2015-09-17 18:29:09,526 INFO [Thread-6] jdbc.Utils (Utils.java:parseURL(285)) - Supplied authorities: 127.0.0.1:10000
2015-09-17 18:29:09,526 INFO [Thread-6] jdbc.Utils (Utils.java:parseURL(372)) - Resolved authority: 127.0.0.1:10000
2015-09-17 18:29:09,529 INFO [Thread-6] jdbc.HiveConnection (HiveConnection.java:openTransport(189)) - Will try to open client transport with JDBC Uri: jdbc:hive2://127.0.0.1:10000/default
Error: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: Cannot modify RHIVE_UDF_DIR at runtime. It is not in list of params that are allowed to be modified at runtime

Any suggestions?

ghost commented 8 years ago

Hi @shashivish, RHive sets this property at runtime, so you need to add it to 'hive.security.authorization.sqlstd.confwhitelist'. See https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.security.authorization.sqlstd.confwhitelist

Thanks.

shashivish commented 8 years ago

Hi @DrakeMin, I added the property hive.security.authorization.sqlstd.confwhitelist to hive-site.xml with the comma-separated list of values mentioned in the link. The RHIVE_UDF_DIR error is gone, but I now get a new exception:

Error: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: Cannot modify hive.exec.compress.output at runtime. It is not in list of params that are allowed to be modified at runtime

I checked, and hive.exec.compress.output is in the list of values I provided for hive.security.authorization.sqlstd.confwhitelist.

Any pointers here?

Thanks

ghost commented 8 years ago

Hi @shashivish, use this command to get the current whitelist:

set hive.security.authorization.sqlstd.confwhitelist;

Check the returned list; it should include hive.exec.compress.output. If it does not, add the property to the configuration.

Thanks

shashivish commented 8 years ago

Hi @DrakeMin, I just ran the command in the Hive shell. Here is the output of set hive.security.authorization.sqlstd.confwhitelist:

hive.security.authorization.sqlstd.confwhitelist=hive.exec.reducers.bytes.per.reducer,hive.exec.reducers.max,hive.map.aggr,hive.map.aggr.hash.percentmemory,hive.map.aggr.hash.force.flush.memory.threshold,hive.map.aggr.hash.min.reduction,hive.groupby.skewindata,hive.optimize.multigroupby.common.distincts,hive.optimize.index.groupby,hive.optimize.ppd,hive.optimize.ppd.storage,hive.ppd.recognizetransivity,hive.optimize.groupby,hive.optimize.sort.dynamic.partition,hive.optimize.union.remove,hive.multigroupby.singlereducer,hive.map.groupby.sorted,hive.map.groupby.sorted.testmode,hive.optimize.skewjoin,hive.optimize.skewjoin.compiletime,hive.mapred.mode,hive.enforce.bucketmapjoin,hive.exec.compress.output,hive.exec.compress.intermediate,hive.exec.parallel,hive.exec.parallel.thread.number,hive.exec.rowoffset,hive.merge.mapfiles,hive.merge.mapredfiles,hive.merge.tezfiles,hive.ignore.mapjoin.hint,hive.auto.convert.join,hive.auto.convert.join.noconditionaltask,hive.auto.convert.join.noconditionaltask.size,hive.auto.convert.join.use.nonstaged,hive.enforce.bucketing,hive.enforce.sorting,hive.enforce.sortmergebucketmapjoin,hive.auto.convert.sortmerge.join,hive.execution.engine,hive.vectorized.execution.enabled,hive.mapjoin.optimized.keys,hive.mapjoin.lazy.hashtable,hive.exec.check.crossproducts,hive.compat,hive.exec.dynamic.partition.mode,mapred.reduce.tasks,mapred.output.compression.codec,mapred.map.output.compression.codec,mapreduce.job.reduce.slowstart.completedmaps,mapreduce.job.queuename

It contains the hive.exec.compress.output property.

taeyoung-yoon commented 8 years ago

Hi @shashivish, unfortunately the current Hive wiki is incorrect. The value must be a Java regular expression, not a comma-separated list of values. See the following Hive Jira for details: https://issues.apache.org/jira/browse/HIVE-8937
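To illustrate why the comma-separated value fails, here is a minimal sketch. It assumes that Hive requires the whole property name to match the whitelist pattern, and it uses Python's `re.fullmatch` as a stand-in for Java regex matching (the two behave the same for patterns this simple). The helper name `is_allowed` and the shortened lists are hypothetical.

```python
import re

def is_allowed(whitelist_regex: str, prop: str) -> bool:
    """Return True if the whole property name matches the whitelist regex."""
    return re.fullmatch(whitelist_regex, prop) is not None

# Comma-separated value in the style the wiki suggested (shortened here).
comma_list = "hive.exec.compress.output,hive.exec.parallel"

# The same names joined with '|' and with the dots escaped.
alternation = r"hive\.exec\.compress\.output|hive\.exec\.parallel"

# Read as a regex, the comma list is one long pattern that only matches
# a string containing the literal comma, so a single name never matches.
print(is_allowed(comma_list, "hive.exec.compress.output"))   # False
print(is_allowed(alternation, "hive.exec.compress.output"))  # True
```

This would explain why hive.exec.compress.output was rejected even though it appeared in the configured list.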

I also think it is better to set 'hive.security.authorization.sqlstd.confwhitelist.append' rather than 'hive.security.authorization.sqlstd.confwhitelist', because setting 'hive.security.authorization.sqlstd.confwhitelist' overwrites the default settings. It is enough to add the few properties you need, as Java regular expressions, to 'hive.security.authorization.sqlstd.confwhitelist.append'.

shashivish commented 8 years ago

Thanks @taeyoung-yoon. I updated hive.security.authorization.sqlstd.confwhitelist.append with the list of values. Here is the output of the set command.

hive> set hive.security.authorization.sqlstd.confwhitelist.append;
hive.security.authorization.sqlstd.confwhitelist.append=mapred\.child\.env|hive\.exec\.reducers\.bytes\.per\.reducer|hive\.exec\.reducers\.max|hive\.map\.aggr|hive\.map\.aggr\.hash\.percentmemory|hive\.map\.aggr\.hash\.force\.flush\.memory\.threshold|hive\.map\.aggr\.hash\.min\.reduction|hive\.groupby\.skewindata|hive\.optimize\.multigroupby\.common\.distincts|hive\.optimize\.index\.groupby|hive\.optimize\.ppd|hive\.optimize\.ppd\.storage|hive\.ppd\.recognizetransivity|hive\.optimize\.groupby|hive\.optimize\.sort\.dynamic\.partition|hive\.optimize\.union\.remove|hive\.multigroupby\.singlereducer|hive\.map\.groupby\.sorted|hive\.map\.groupby\.sorted\.testmode|hive\.optimize\.skewjoin|hive\.optimize\.skewjoin\.compiletime|hive\.mapred\.mode|hive\.enforce\.bucketmapjoin|hive\.exec\.compress\.output|hive\.exec\.compress\.intermediate|hive\.exec\.parallel|hive\.exec\.parallel\.thread\.number|hive\.exec\.rowoffset|hive\.merge\.mapfiles|hive\.merge\.mapredfiles|hive\.merge\.tezfiles|hive\.ignore\.mapjoin\.hint|hive\.auto\.convert\.join|hive\.auto\.convert\.join\.noconditionaltask|hive\.auto\.convert\.join\.noconditionaltask\.size|hive\.auto\.convert\.join\.use\.nonstaged|hive\.enforce\.bucketing|hive\.enforce\.sorting|hive\.enforce\.sortmergebucketmapjoin|hive\.auto\.convert\.sortmerge\.join|hive\.execution\.engine|hive\.vectorized\.execution\.enabled|hive\.mapjoin\.optimized\.keys|hive\.mapjoin\.lazy\.hashtable|hive\.exec\.check\.crossproducts|hive\.compat|hive\.exec\.dynamic\.partition\.mode|mapred\.reduce\.tasks|mapred\.output\.compression\.codec|mapred\.map\.output\.compression\.codec|mapreduce\.job\.reduce\.slowstart\.completedmaps|mapreduce\.job\.queuename

After doing this I got the exception below again.

Error: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: Cannot modify RHIVE_UDF_DIR at runtime. It is not in list of params that are allowed to be modified at runtime

Let me know if I am doing anything wrong.

Thanks

ghost commented 8 years ago

@shashivish IMHO, your config file should look like this:

<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>RHIVE_UDF_DIR</value>
</property>

hive.security.authorization.sqlstd.confwhitelist already has default values, so you do not need to set it yourself.
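A rough sketch of how the default whitelist and the appended value fit together. It assumes Hive joins the two patterns with '|' and requires a full match of the property name. The `default_regex` here is a hypothetical, heavily shortened stand-in for Hive's real default pattern, and both function names are made up for illustration.

```python
import re

def effective_whitelist(default_regex: str, append_regex: str) -> str:
    """Join the default pattern and the appended pattern as alternatives."""
    if not append_regex:
        return default_regex
    return f"{default_regex}|{append_regex}"

def can_set(whitelist_regex: str, prop: str) -> bool:
    """Return True if the whole property name matches the whitelist regex."""
    return re.fullmatch(whitelist_regex, prop) is not None

# Hypothetical, shortened stand-in for Hive's built-in default pattern.
default_regex = r"hive\.exec\.compress\.output|hive\.exec\.parallel"

before = effective_whitelist(default_regex, "")
after = effective_whitelist(default_regex, "RHIVE_UDF_DIR")

print(can_set(before, "RHIVE_UDF_DIR"))             # False
print(can_set(after, "RHIVE_UDF_DIR"))              # True
print(can_set(after, "hive.exec.compress.output"))  # True
```

Under these assumptions, appending only RHIVE_UDF_DIR is enough, because the default entries stay in effect.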

Thanks

shashivish commented 8 years ago

Hi @DrakeMin, I agree with your point, but according to @taeyoung-yoon's comment the value of hive.security.authorization.sqlstd.confwhitelist.append should be a regex, which is why I changed it.

The Jira ticket says the same:

https://issues.apache.org/jira/browse/HIVE-8937

If that is the case, can you please let me know how I should update hive-site.xml?

Thanks, Shashi

taeyoung-yoon commented 8 years ago

Hi @Shashi, IMO you don't need to set any value for the 'hive.security.authorization.sqlstd.confwhitelist' property in your hive-site.xml; leaving it unset lets you use the default set of properties. If you need any properties that are not in the default list, such as RHIVE_UDF_DIR, just add them as a Java regex to the 'hive.security.authorization.sqlstd.confwhitelist.append' property in your hive-site.xml. For example, the following commands must be allowed in order to use RHive:

set mapred.child.env = 'foo bar';
set RHIVE_UDF_DIR = '/usr/lib/x/y';
set query.invoker = 'user1';

You just need the Hive configuration below.

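<property>
  <name>hive.security.authorization.sqlstd.confwhitelist</name>
  <value></value>
</property>
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>query.invoker|RHIVE_UDF_DIR</value>
</property>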
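If it helps, the value for the append property can be generated instead of typed by hand. This is only a sketch: `build_append_value` is a hypothetical helper, and it relies on Python's `re.escape` producing escapes that are also valid in a Java regex, which holds for plain names like these (only the dots get escaped).

```python
import re

def build_append_value(props):
    """Escape each property name and join the results with '|'."""
    return "|".join(re.escape(p) for p in props)

# The three properties RHive sets, taken from the set commands in this thread.
rhive_props = ["mapred.child.env", "RHIVE_UDF_DIR", "query.invoker"]

value = build_append_value(rhive_props)
print(value)

# Sanity check: every property fully matches the generated pattern.
for prop in rhive_props:
    assert re.fullmatch(value, prop) is not None
```

Escaping the dots is stricter than leaving them bare. An unescaped '.' still matches the property name, but it also matches any other character in that position.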

Thanks, Taeyoung

shashivish commented 8 years ago

Thank you very much @taeyoung-yoon and @DrakeMin . Issue got resolved. It was a great help. :+1: