nexr / RHive

RHive is an R extension facilitating distributed computing via Apache Hive.
http://nexr.github.io/RHive

rhive.export() - Environment Variable issue #70

Open rajasekhariitbbs opened 9 years ago

rajasekhariitbbs commented 9 years ago

rhive.export() does not store the RUDF in the HDFS path /rhive/udf/username; it saves it to /rhive/udf/username on the local file system instead. If I copy the RUDF into HDFS manually, the query works; otherwise it throws an error. It would be great if you could provide a solution.

ssshow16 commented 9 years ago

Hi.

I guess that RHive did not find the Hadoop configuration. You can check HADOOP_HOME with the rhive.env() function. Please check that HADOOP_HOME is a valid path.

If HADOOP_HOME is invalid, call Sys.setenv(HADOOP_HOME="/xxx/hadoop-xxx"). There is more information at the following link: https://github.com/nexr/RHive/wiki/User-Guide

Thanks.


ssshow16 commented 9 years ago

After calling Sys.setenv(), you must call rhive.init() again.
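Before re-running rhive.init(), it may help to confirm that the variables really point at existing directories. The following is a minimal base-R sketch, not part of RHive; the helper name check_env_dir is hypothetical, and it only checks that the directory exists, not that it holds a working Hadoop or Hive installation.

```r
# Hypothetical helper (not part of RHive): report whether an environment
# variable is set and points at an existing directory.
check_env_dir <- function(name) {
  value <- Sys.getenv(name, unset = NA_character_)
  if (is.na(value) || !nzchar(value)) {
    return(sprintf("%s is not set", name))
  }
  if (!dir.exists(value)) {
    return(sprintf("%s points to a missing directory: %s", name, value))
  }
  sprintf("%s looks valid: %s", name, value)
}

# Check the two variables discussed in this thread.
for (name in c("HADOOP_HOME", "HIVE_HOME")) {
  message(check_env_dir(name))
}
```

If either variable is reported as unset or missing, fix it with Sys.setenv() and then call rhive.init() again, as noted above.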

rajasekhariitbbs commented 9 years ago

library(RHive)

Sys.setenv(HIVE_HOME='/home/training/hive/')

Sys.setenv(HADOOP_HOME='/home/training/hadoop-2.4.0/')

rhive.init(hiveHome='/home/training/hive/',hadoopHome='/home/training/hadoop-2.4.0/')
rhive.init()
rhive.connect('localhost',defaultFS='hdfs://localhost:9000',hiveServer2=FALSE)
rhive.env()
hadoop home: /home/training/hadoop-2.4.0/
fs: hdfs://localhost:9000
hive home: /home/training/hive/
user name: training
user home: /home/training
temp dir: /tmp/training

uppercase = function(x){ toupper(x) }
rhive.assign('uppercase',uppercase)
rhive.export('uppercase')

With the above code, uppercase.RData is stored in /rhive/udf/training on the local file system instead of in HDFS.

ssshow16 commented 9 years ago

Hi!

Add the "HADOOP_CONF_DIR" environment variable as follows. Replace the path "/home/training/hadoop-2.4.0/conf" with your Hadoop configuration path, the directory that contains the *.xml files.

Sys.setenv(HADOOP_CONF_DIR="/home/training/hadoop-2.4.0/conf")
rhive.init()
...

Please try it again.
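The configuration directory is not always named conf; Hadoop 2.x releases normally keep the *.xml files under etc/hadoop. As a rough sketch, the base-R snippet below looks for core-site.xml in both places under HADOOP_HOME and sets HADOOP_CONF_DIR to the first match. The helper find_hadoop_conf_dir and its candidate list are assumptions for illustration, not part of RHive.

```r
# Hypothetical helper (not part of RHive): return the first directory under
# hadoop_home that contains core-site.xml, or NA if none is found.
find_hadoop_conf_dir <- function(hadoop_home,
                                 candidates = c("etc/hadoop", "conf")) {
  if (!nzchar(hadoop_home)) {
    return(NA_character_)
  }
  # Drop trailing slashes so the joined path has no doubled separator.
  hadoop_home <- sub("/+$", "", hadoop_home)
  for (sub_dir in candidates) {
    dir <- file.path(hadoop_home, sub_dir)
    if (file.exists(file.path(dir, "core-site.xml"))) {
      return(dir)
    }
  }
  NA_character_
}

conf_dir <- find_hadoop_conf_dir(Sys.getenv("HADOOP_HOME"))
if (is.na(conf_dir)) {
  message("No core-site.xml found; set HADOOP_CONF_DIR by hand.")
} else {
  Sys.setenv(HADOOP_CONF_DIR = conf_dir)
  message("HADOOP_CONF_DIR set to ", conf_dir)
}
```

After the variable is set, rhive.init() has to be called again for RHive to pick it up.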


rajasekhariitbbs commented 9 years ago

rhive.env()
hadoop home: /home/training/hadoop-2.4.0/
hadoop conf: /home/training/hadoop-2.4.0/etc/hadoop
fs: hdfs://localhost:9000
hive home: /home/training/hive/
user name: training
user home: /home/training
temp dir: /tmp/training

Still the same error. I am able to query the UDF if I manually save it in HDFS under /rhive/udf/training.
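For anyone needing the manual workaround in the meantime, the copy can be scripted from R by shelling out to the standard hdfs command-line tool. This is only a sketch: the helper hdfs_put_args is hypothetical, the local file path is inferred from the paths mentioned in this thread, and the commented-out calls assume the hdfs launcher is on the PATH.

```r
# Hypothetical helper (not part of RHive): build the arguments for
# `hdfs dfs -put -f <local_file> <hdfs_dir>`.
hdfs_put_args <- function(local_file, hdfs_dir) {
  c("dfs", "-put", "-f", local_file, hdfs_dir)
}

# Assumed locations, taken from the paths reported in this thread.
args <- hdfs_put_args("/rhive/udf/training/uppercase.RData",
                      "/rhive/udf/training/")

# Print the command for review before running anything.
cat("hdfs", args, "\n")

# Uncomment to create the target directory and copy the file.
# system2("hdfs", c("dfs", "-mkdir", "-p", "/rhive/udf/training"))
# system2("hdfs", args)
```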

ssshow16 commented 9 years ago

Fixed a bug in loading the Hadoop configuration.

Please install RHive again and try it!