RevolutionAnalytics / RHadoop

RHadoop
https://github.com/RevolutionAnalytics/RHadoop/wiki

rmr2 rhadoop - outofmemory exception #233

Open ssetty opened 8 years ago

ssetty commented 8 years ago

Hello, we are trying the rmr2 package from RHadoop and get an OutOfMemory exception. Below is sample code that aggregates sales quarterly. What are the signatures of the map, reduce, and keyval functions?

Some documents say "key must be a matrix with a column and the same number of rows as the value." Kindly explain.

Thanks

Sample input data:

```
time, sales
1,206
2,245
3,185
4,169
5,162
6,177
7,207
8,216
9,193
10,230
11,212
12,192
```

```r
library(rmr2)

map <- function(key, value) {
  print("sales are ")
  print(value)
  salesLine <- unlist(strsplit(value, "[,]"))
  month <- salesLine[1]
  sales <- salesLine[2]

  if (month >= 1 && month <= 4) {
    month <- 4
  } else if (month >= 5 && month <= 8) {
    month <- 8
  } else {
    month <- 12
  }

  return(keyval(month, sales))
}

reduce <- function(key, values) {
  keyval(key, sum(values))
}

forecast <- function(input, output = NULL) {
  mapreduce(input = input, input.format = "text", map = map, output = output)
}

hdfs.root <- "forecast"
hdfs.data <- file.path(hdfs.root, "input")
hdfs.out <- file.path(hdfs.root, "output")

out <- forecast(hdfs.data, hdfs.out)
```
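For anyone comparing notes, here is a minimal sketch of how the map and reduce above might be written, not a confirmed fix for the OutOfMemory exception. It assumes that with `input.format="text"` the `value` passed to map is a character vector holding many lines at once, and that `keyval(key, val)` pairs keys and values element by element (which seems to be what the quoted "same number of rows" sentence is about). It also converts the fields to numbers, because in R a comparison such as `"10" <= 4` is done on strings, and it passes `reduce` to `mapreduce`, which the code above does not. The helper name `sales_pairs` is made up for illustration.

```r
# Hypothetical helper: turn a character vector of "month,sales" lines into
# parallel key and value vectors (one key per value).
sales_pairs <- function(lines) {
  fields <- strsplit(lines, ",", fixed = TRUE)
  month <- suppressWarnings(as.integer(sapply(fields, `[`, 1)))
  sales <- suppressWarnings(as.numeric(sapply(fields, `[`, 2)))
  keep <- !is.na(month) & !is.na(sales)  # drops the "time, sales" header line
  # Same buckets as the original code: months 1-4 -> 4, 5-8 -> 8, 9-12 -> 12.
  bucket <- ifelse(month <= 4, 4L, ifelse(month <= 8, 8L, 12L))
  list(key = bucket[keep], val = sales[keep])
}

# rmr2 wiring (assumes rmr2 is installed; these are only defined here, not run).
map <- function(key, value) {
  p <- sales_pairs(value)
  rmr2::keyval(p$key, p$val)
}
reduce <- function(key, values) {
  rmr2::keyval(key, sum(values))
}
forecast <- function(input, output = NULL) {
  rmr2::mapreduce(input = input, output = output, input.format = "text",
                  map = map, reduce = reduce)
}

# Local check of the parsing and grouping logic, without Hadoop or rmr2.
lines <- c("time, sales", "1,206", "2,245", "3,185", "4,169", "5,162", "6,177",
           "7,207", "8,216", "9,193", "10,230", "11,212", "12,192")
pairs <- sales_pairs(lines)
totals <- tapply(pairs$val, pairs$key, sum)
print(totals)  # buckets 4, 8, 12 sum to 805, 762, 827
```

The local check only exercises the base R part. Whether the helper is visible to map tasks on a real cluster depends on how rmr2 ships the function environment, so inlining it into `map` may be the safer choice there.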