Open shrutimantri opened 8 months ago
Sys.getenv("SPARK_HOME") should normally resolve to /opt/bitnami/spark if you're running this script with the default bitnami/spark image.
the default example
id: "r_submit"
type: "io.kestra.plugin.spark.RSubmit"
runner: DOCKER
docker:
  networkMode: host
  user: root
master: spark://localhost:7077
mainScript: |
  library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
  sparkR.session()
  print("The SparkR session has initialized successfully.")
  sparkR.stop()
from here fails with the following error:
Exception in thread "main" java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory
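For context, error=2 is the operating system's "no such file or directory" code, so the message means no Rscript executable was found on the PATH when the task was launched. Below is a minimal diagnostic sketch, assuming a Python interpreter is available in the same container; the helper name check_spark_r_env and the shape of its return value are hypothetical and not part of Kestra or Spark. It only reports what it finds and does not fix anything.

```python
import os
import shutil


def check_spark_r_env(env=None):
    """Report whether SPARK_HOME is set and whether Rscript is on PATH.

    `env` is a mapping of environment variables; it defaults to the
    current process environment. (Hypothetical helper for illustration.)
    """
    env = os.environ if env is None else env

    # SPARK_HOME is what the R script reads via Sys.getenv("SPARK_HOME").
    spark_home = env.get("SPARK_HOME")

    # Look up the Rscript executable on the given PATH.
    rscript = shutil.which("Rscript", path=env.get("PATH"))

    # SparkR is loaded from <SPARK_HOME>/R/lib in the example flow.
    sparkr_lib = os.path.join(spark_home, "R", "lib") if spark_home else None

    return {
        "spark_home": spark_home,
        "sparkr_lib_exists": bool(sparkr_lib and os.path.isdir(sparkr_lib)),
        "rscript_path": rscript,
    }


if __name__ == "__main__":
    for key, value in check_spark_r_env().items():
        print(f"{key}: {value}")
```

If rscript_path comes back as None, the image in use most likely does not ship R at all, which would explain the exception regardless of what SPARK_HOME is set to.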
Expected Behavior
-
Actual Behaviour
In the R Spark flow example provided here: https://kestra.io/plugins/plugin-spark/tasks/io.kestra.plugin.spark.RSubmit, what should SPARK_HOME be set to as an environment variable?
This runs in a Docker runner, so it's unclear what SPARK_HOME should be set to. Once we know exactly what the flow should look like, I can update the documentation accordingly.
Steps To Reproduce
N/A
Environment Information
Example flow
Flow as provided here: https://kestra.io/plugins/plugin-spark/tasks/io.kestra.plugin.spark.RSubmit