RevolutionAnalytics / rmr2

A package that allows R developer to use Hadoop MapReduce

rmr from.dfs() problem #150

Closed by mesutkaya 9 years ago

mesutkaya commented 9 years ago

Hi,

We have installed R 3.0.2 on a 3-node Hadoop cluster, along with the required installation dependencies. We have also successfully installed the rmr2 and rhdfs packages with their dependent libraries. To check that everything works, we tried a simple case:

library(rmr2)
from.dfs(to.dfs(1:100))
from.dfs(mapreduce(to.dfs(1:100)))

We get the following error:

hadoop streaming failed with error code 1
Error in if (file.exists(cmd)) return(cmd) : argument is of length zero

It seems that we cannot read any input from HDFS using the from.dfs() function. We are using the Hadoop 1.2.1 distribution. Could the problem be related to this Hadoop version or to the R version?

We also checked whether permissions could be the problem, but both R and Hadoop are installed under the same user.

Any ideas are appreciated.

Best,
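A round-trip check like the one described above can also be tried on rmr2's local backend, which runs without Hadoop streaming and so may help tell a cluster-side problem from a package-side one. The following is only a sketch: `roundtrip_ok` is a hypothetical helper name, and whether the local backend reproduces this particular failure is an assumption, not something the thread confirms.

```r
# Hypothetical helper: writes 1:100 with to.dfs(), reads it back with
# from.dfs(), and reports whether the values survived the round trip.
# Returns NA when rmr2 is not installed on this machine.
roundtrip_ok <- function() {
  if (!requireNamespace("rmr2", quietly = TRUE)) {
    return(NA)
  }
  # Switch to the local backend so no Hadoop cluster is involved.
  rmr2::rmr.options(backend = "local")
  out <- rmr2::from.dfs(rmr2::to.dfs(1:100))
  # from.dfs() returns a key-value pair; values() extracts the data.
  identical(as.integer(rmr2::values(out)), 1:100)
}

result <- roundtrip_ok()
print(result)
```

Note that the sketch leaves the backend set to "local"; `rmr.options(backend = "hadoop")` switches back before running jobs on the cluster.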

mesutkaya commented 9 years ago

The problem is resolved. It was related to rmr2 version 3.2.0, in which from.dfs() was not working. We installed rmr2 3.3.0 on all nodes of the cluster, and that fixed it.
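Since the fix depended on every node running the newer package, it can be handy to confirm the installed rmr2 version on each node. The sketch below uses base R's `package_version()` and `utils::packageVersion()`; the helper names `has_min_version` and `check_rmr2` are hypothetical, and the 3.3.0 threshold is simply the version reported as working in this thread.

```r
# Hypothetical helper: TRUE when `installed` is at least `minimum`.
# Both arguments are version strings such as "3.3.0".
has_min_version <- function(installed, minimum = "3.3.0") {
  package_version(installed) >= package_version(minimum)
}

# Hypothetical helper: checks the rmr2 version installed on this node.
# Returns NA when rmr2 is not installed at all.
check_rmr2 <- function(minimum = "3.3.0") {
  if (!requireNamespace("rmr2", quietly = TRUE)) {
    return(NA)
  }
  has_min_version(as.character(utils::packageVersion("rmr2")), minimum)
}

print(check_rmr2())
```

Running this on every node (for example over ssh with Rscript) should show quickly whether any machine still has the older version.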

piccolbo commented 9 years ago

Thanks for the report. I think this is the same as #132. Glad the fix is working.