CODAIT / r4ml

Scalable R for Machine Learning
Apache License 2.0

r4ml.lm() example broken #51

Closed (bdwyer2 closed 6 years ago)

bdwyer2 commented 7 years ago

The example code in the vignette for r4ml.lm() fails to run with the following error message:

Error: ERROR[sysml.execute]: DML returned error: Error in handleErrors(returnStatus, conn): org.apache.sysml.runtime.DMLRuntimeException: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 131 and 136 -- Error evaluating instruction: CP°ba+*°_mVar170·MATRIX·DOUBLE°X·MATRIX·DOUBLE°_mVar171·MATRIX·DOUBLE°48

Sample code:

airlineFiltered <- airline[, c("Month", "DayofMonth", "DayOfWeek", "CRSDepTime",
                               "Distance", "ArrDelay")]

airlineFiltered <- as.r4ml.frame(airlineFiltered)

airlineFiltered <- r4ml.ml.preprocess(
  data = airlineFiltered,
  transformPath = "/tmp",
  recodeAttrs = c("DayOfWeek"),
  omit.na = c("Distance", "ArrDelay"),
  dummycodeAttrs = c("DayOfWeek")
)

airlineMatrix <- as.r4ml.matrix(airlineFiltered$data)

samples <- r4ml.sample(airlineMatrix, perc = c(0.7, 0.3))
train <- samples[[1]]

outputs <- sysml.execute(
  dml = '
  fileX = "";
  fileY = "";
  X = read($fileX);
  Y = read($fileY);
  nrow_X = nrow(X);
  nrow_Y = nrow(Y);
  stop("nrow_X = " + nrow_X + "   nrow_Y = " + nrow_Y);',
  X = .r4ml.separateXAndY(train, "ArrDelay")$X,
  Y = .r4ml.separateXAndY(train, "ArrDelay")$Y
)
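
For comparison, the same row-count check can be sketched in base R without Spark or SystemML. `ba+*` is SystemML's matrix-multiplication instruction, so a row mismatch between X and Y is one possible cause of the error above, though nothing in this report confirms it. The sketch assumes made-up data rather than the airline data set, and `separate_xy` is a hypothetical stand-in for `.r4ml.separateXAndY`, not the package's implementation.

```r
# Hypothetical stand-in for .r4ml.separateXAndY: split a numeric matrix
# into the predictor columns X and a single response column Y.
separate_xy <- function(m, y_name) {
  stopifnot(is.matrix(m), y_name %in% colnames(m))
  list(
    X = m[, colnames(m) != y_name, drop = FALSE],
    Y = m[, y_name, drop = FALSE]
  )
}

# Made-up data for illustration only (not the airline data set).
set.seed(1)
train <- cbind(
  Distance   = runif(10, 100, 2000),
  CRSDepTime = sample(0:2359, 10),
  ArrDelay   = rnorm(10)
)

parts <- separate_xy(train, "ArrDelay")

# A product such as t(X) %*% Y is only defined when X and Y
# have the same number of rows, so check that first.
stopifnot(nrow(parts$X) == nrow(parts$Y))

# crossprod(X, Y) computes t(X) %*% Y.
xty <- crossprod(parts$X, parts$Y)
dim(xty)
```

If the `stopifnot` row check fails on real data, the mismatch was introduced before the multiplication, for example during preprocessing or sampling.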

bdwyer2 commented 6 years ago

Fixed by #52