RevolutionAnalytics / rhdfs

A package that allows R developers to use Hadoop HDFS

Getting error while loading rhdfs library in R #1

Closed canil closed 11 years ago

canil commented 11 years ago

Hi, I am getting an error while loading rhdfs in R. library(rJava) loads successfully.

I cannot solve it. Could someone help? Thanks in advance.

library("rhdfs")
Error : .onLoad failed in loadNamespace() for 'rhdfs', details:
call: fun(libname, pkgname)
error: Environment variable HADOOP_CMD must be set before loading package rhdfs
Error: package/namespace load failed for ‘rhdfs’

I tried to reconfigure rJava, but it did not change anything.

canil@ubuntu:/$ sudo R CMD javareconf
Java interpreter : /usr/bin/java
Java version     : 1.7.0_13
Java home path   : /usr/lib/jvm/java-7-oracle/jre
Java compiler    : /usr/bin/javac
Java headers gen.: /usr/bin/javah
Java archive tool: /usr/bin/jar
NOTE: Your JVM has a bogus java.library.path system property!
Trying a heuristic via sun.boot.library.path to find jvm library...
Java library path: $(JAVA_HOME)/lib/amd64:$(JAVA_HOME)/lib/amd64/server
JNI linker flags : -L$(JAVA_HOME)/lib/amd64 -L$(JAVA_HOME)/lib/amd64/server -ljvm
JNI cpp flags    : -I$(JAVA_HOME)/../include -I$(JAVA_HOME)/../include/linux

Updating Java configuration in /etc/R
Done.

I have already set HADOOP_CMD, which is the path to my hadoop executable, like this:

export HADOOP_CMD=/usr/local/hadoop/bin/hadoop

What should I do?

piccolbo commented 11 years ago

Thanks for filing your issue again here. I am not sure but you clearly set the variable and R clearly can't see it. So are there different accounts in play here? Are you sudo-ing any of those commands (sudo at defaults doesn't honor exports)? Did you try a Sys.getenv("HADOOP_CMD") just before or after your attempt to load rhdfs? Thanks
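The point about exports can be checked without R at all. The following is a minimal sketch, assuming a POSIX shell and reusing the path from the report above as an example value. A child `sh -c` process stands in for R here: like R started from a terminal, it only sees variables that were exported by the shell that launched it. As a related note, `sudo` in its default configuration resets most of the environment, which is why an export in your own shell may not reach a command run through `sudo` unless something like `sudo -E` is used.

```shell
#!/bin/sh
# Start from a clean slate so the demonstration does not depend on the caller.
unset HADOOP_CMD

# Plain assignment: the variable exists only inside this shell.
# The path is the example value from the report; adjust it for your install.
HADOOP_CMD=/usr/local/hadoop/bin/hadoop

# A child process (standing in for R) does not see the unexported variable.
unexported=$(sh -c 'printf "%s" "$HADOOP_CMD"')

# Mark the variable for export so that child processes inherit it.
export HADOOP_CMD

# The same kind of child process now sees the value.
exported=$(sh -c 'printf "%s" "$HADOOP_CMD"')

echo "before export: '$unexported'"
echo "after export:  '$exported'"
```

If the second line prints the path and the first prints an empty string, the shell is behaving as expected, and an empty `Sys.getenv("HADOOP_CMD")` in R suggests that R was started from a different shell, account, or `sudo` context than the one where the export was done.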

canil commented 11 years ago

Hi, I did not fully understand your question, but yes, there are different users here. My Hadoop cluster runs under hduser, and the admin account is canil. I ran the commands in R, and here is the result:

Sys.getenv("HADOOP_CMD")
[1] ""
library(rhdfs)
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rhdfs', details:
call: fun(libname, pkgname)
error: Environment variable HADOOP_CMD must be set before loading package rhdfs
Error: package/namespace load failed for ‘rhdfs’
Sys.getenv("HADOOP_CMD")
[1] ""

So does that mean I did not set HADOOP_CMD correctly?

canil commented 11 years ago

I installed rhdfs like this:

canil@ubuntu:/$ sudo -E R CMD INSTALL /home/canil/Downloads/rhdfs_1.0.5.tar.gz

and it worked.

So what should I do? Thanks

piccolbo commented 11 years ago

Yes, you did not set it where it matters. Please enter these one after the other in a terminal:

export HADOOP_CMD=<your path here>
R
Sys.getenv("HADOOP_CMD")

canil commented 11 years ago

So you mean typing the commands consecutively? If so, I did it.

canil@ubuntu:~$ export HADOOP_CMD=
canil@ubuntu:~$ R

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

Sys.getenv("HADOOP_CMD")
[1] ""

piccolbo commented 11 years ago

I made a markup mistake. Please reread my previous comment and try again. Sorry about that.

canil commented 11 years ago

By the way, I did it like this, and I think it worked:

library(rhdfs)
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rhdfs', details:
call: fun(libname, pkgname)
error: Environment variable HADOOP_CMD must be set before loading package rhdfs
Error: package/namespace load failed for ‘rhdfs’
Sys.setenv("HADOOP_CMD"="/usr/local/hadoop/bin/hadoop")
library(rhdfs)

HADOOP_CMD=/usr/local/hadoop/bin/hadoop

Be sure to run hdfs.init()

hdfs.init()

canil commented 11 years ago

After I ran hdfs.init(), there was no error and no output. So does that mean everything is working? Another question: do I have to set HADOOP_CMD manually every time before I use rhdfs?

piccolbo commented 11 years ago

Please consult your shell manual. This is off topic for this issue tracker. I consider this issue resolved.
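On the question of setting HADOOP_CMD every session: one common approach is to put the export line in a shell startup file so each new login shell, and any R started from it, inherits the variable. The following is a rough sketch under stated assumptions, not a recommendation from the rhdfs maintainers. `add_export_line` is a hypothetical helper name, the path is the example value from this thread, and the demonstration writes to a scratch file rather than a real startup file. Which file actually applies (for example `~/.profile` or `~/.bashrc`) depends on your shell and on how R is launched.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so repeated runs do not create duplicates.
add_export_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file; in practice the target would be a
# startup file read by your shell.
startup_file=$(mktemp)

add_export_line "$startup_file" 'export HADOOP_CMD=/usr/local/hadoop/bin/hadoop'
add_export_line "$startup_file" 'export HADOOP_CMD=/usr/local/hadoop/bin/hadoop'  # second call adds nothing

cat "$startup_file"
```

After editing a real startup file, the change only takes effect in shells started afterwards (or after sourcing the file), so R needs to be restarted from such a shell before `Sys.getenv("HADOOP_CMD")` will show the value.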

canil commented 11 years ago

Yes I think so. Thank you very much.