QuantScientist3 closed this issue 8 years ago.
Can you provide the result of running ls -laF on the directory you're using for SPARK_HOME?
(FYI: often, with Mac installs of Spark, the proper SPARK_HOME is actually a subdirectory of wherever you installed Spark. The directory contents will help isolate whether that's the issue.)
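For what it's worth, here is a minimal R sketch for that check, assuming SPARK_HOME points at an extracted binary Spark distribution (one that bundles the SparkR package under R/lib); the path below is a placeholder, not an install location from this thread:

# Hypothetical path, for illustration only; substitute your own SPARK_HOME.
spark_home <- "/usr/local/spark-1.5.2-bin-hadoop2.6"
print(list.files(spark_home))                                    # a binary distro lists bin/, conf/, lib/, R/, ...
print(file.exists(file.path(spark_home, "R", "lib", "SparkR")))  # TRUE only if the SparkR package is bundled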
On Jan 16, 2016, at 4:23 PM, Shlomo. notifications@github.com wrote:
Hi, Working on OSX, I tried every possible combination of setting/unsetting SPARK_HOME (*.sh files) and/or setting spark.home in interpreter.json.
It fails with:
Error in loadNamespace(name): there is no package called 'SparkR'
My standalone R version works great with the same Spark 1.5 distribution.
What am I doing wrong? BTW, I don't have an issue with case sensitivity on OSX.
Thanks,
I had the Spark source distribution instead of the Spark binary distribution; that was the issue. Thanks. Now I have another issue:
R.version
Sys.info()
mypkgs <- c("dplyr", "ggplot2", "magrittr", "parallel")
Sys.setenv(JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/")
library("rJava")
Sys.setenv(SPARK_HOME="/Users/freebsd/repo/dev/java/rt/spark/")
print(Sys.getenv("SPARK_HOME"))
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
print(.libPaths())
Sys.setenv("PATH" = paste(Sys.getenv("PATH"), "/Library/Frameworks/R.framework/Versions/3.2/Resources/bin", file.path(Sys.getenv("SPARK_HOME"), "bin"), sep = ":"))
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.3.0" "sparkr-shell"')
print(Sys.getenv("PATH"))
library(SparkR)
sc <- sparkR.init(master = "local", appName = "SparkR_demo_RTA")
sqlContext <- sparkRSQL.init(sc)
df <- createDataFrame(sqlContext, faithful)
head(df)
printSchema(df)
df2 <- read.df(sqlContext, "/Users/freebsd/repo/dev/data-sets/mlr.csv", source = "com.databricks.spark.csv", inferSchema = "true")
The com.databricks:spark-csv_2.10:1.3.0 package does not load, even though it loads correctly from the Spark shell.
Someone else reported the same thing. It appears to be an issue with the way Zeppelin handles dependencies. The workaround is to load the data frame using the %spark interpreter, give it a temporary table name, then access the temporary table from R by name.
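A minimal sketch of that workaround, assuming a prior %spark paragraph has already read the CSV and registered it under the hypothetical temporary table name mlr_csv (the table name and the Scala snippet in the comments are illustrative, not taken from this thread):

# In a %spark paragraph (Scala), something like:
#   sqlContext.read.format("com.databricks.spark.csv")
#     .option("inferSchema", "true")
#     .load("/Users/freebsd/repo/dev/data-sets/mlr.csv")
#     .registerTempTable("mlr_csv")
# Then, in the R paragraph, pull the temporary table in by name:
df2 <- sql(sqlContext, "SELECT * FROM mlr_csv")
head(df2)
printSchema(df2)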
On Jan 16, 2016, at 5:25 PM, Shlomo. notifications@github.com wrote:
I had the Spark source distribution instead of the Spark binary distribution; that was the issue. Thanks. Now I have another issue:

# print system information
R.version
Sys.info()
mypkgs <- c("dplyr", "ggplot2", "magrittr", "parallel")
install.packages(mypkgs)
Sys.setenv(JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/")
install.packages("rJava")
library("rJava")

# Start SparkR
Sys.setenv(SPARK_HOME="/Users/freebsd/repo/dev/java/rt/spark/")
print(Sys.getenv("SPARK_HOME"))
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
print(.libPaths())
Sys.setenv("PATH" = paste(Sys.getenv("PATH"), "/Library/Frameworks/R.framework/Versions/3.2/Resources/bin", file.path(Sys.getenv("SPARK_HOME"), "bin"), sep = ":"))
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.3.0" "sparkr-shell"')
print(Sys.getenv("PATH"))
library(SparkR)
sc <- sparkR.init(master = "local", appName = "SparkR_demo_RTA")
sqlContext <- sparkRSQL.init(sc)
df <- createDataFrame(sqlContext, faithful)
head(df)

# Print its schema
printSchema(df)
df2 <- read.df(sqlContext, "/Users/freebsd/repo/dev/data-sets/mlr.csv", source = "com.databricks.spark.csv", inferSchema = "true")

The com.databricks:spark-csv_2.10:1.3.0 package does not load, even though it loads correctly from the Spark shell.
I'm going to mark this closed; please re-open if I'm mistaken. Thanks!
Hi, is this related to the second issue I reported, i.e. that the com.databricks:spark-csv_2.10:1.3.0 lib does not load?
Thanks,
Hi, Working on OSX, I tried every possible combination of setting/unsetting SPARK_HOME (*.sh files) and/or setting spark.home in interpreter.json.
It fails with:
Error in loadNamespace(name): there is no package called 'SparkR'
My standalone R version works great with the same Spark 1.5 distribution.
What am I doing wrong? BTW, I don't have an issue with case sensitivity on OSX.
Thanks,