h2oai / sparkling-water

Sparkling Water provides H2O functionality inside Spark cluster
https://docs.h2o.ai/sparkling-water/3.3/latest-stable/doc/index.html
Apache License 2.0

Errors Running RSparkling on Databricks Azure Cluster #3199

Closed: exalate-issue-sync[bot] closed this 1 year ago

exalate-issue-sync[bot] commented 1 year ago

I tried running RSparkling on Databricks/Azure, but ran into the following errors while following the docs (step 4):

  1. The RCurl dependency is missing from the docs. I wasn't able to install h2o without first installing RCurl. I noticed that the H2O for R documentation installs it explicitly.

  2. Unable to start the H2O context: running h2o_context(sc) fails, and according to the error message the function could not be found. I also tried rsparkling::h2o_context(sc), but that didn't work either.

  3. H2OConf() returns: Error : java.lang.ClassNotFoundException: ai.h2o.sparkling.H2OConf

Following the RSparkling docs, I tried running H2OConf() before running hc <- H2OContext.getOrCreate(h2oConf), but the H2OConf() call itself fails with the ClassNotFoundException from item 3.

exalate-issue-sync[bot] commented 1 year ago

Marek Novotny commented: This ticket looks like a duplicate of SW-2516 (https://h2oai.atlassian.net/browse/SW-2516).

Re 3: As a temporary fix, you need to add the Sparkling Water jar (e.g. ai.h2o:sparkling-water-package_2.12:3.32.0.3-1-3.0) to the cluster libraries manually.
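
For reference, a minimal sketch of how the notebook side might look once that jar is attached. It assumes the Sparkling Water package jar was added to the cluster as a Maven library beforehand, that the h2o and rsparkling R packages are installed at versions matching the jar, and that sparklyr's Databricks connection method is available; none of this has been verified against the setup in this ticket.

```r
# Assumption: the Sparkling Water jar (e.g. the Maven coordinate above) is
# already attached to the cluster libraries; this script does not install it.

# Item 1: RCurl has to be present before the h2o R package can be installed.
install.packages("RCurl")

# Assumption: h2o and rsparkling are installed separately, at versions that
# match the attached Sparkling Water jar.
library(sparklyr)
library(rsparkling)

# Connect to the Spark session of the Databricks cluster.
sc <- spark_connect(method = "databricks")

# Item 3: H2OConf() needs ai.h2o.sparkling.H2OConf on the classpath,
# which is what the manually attached jar is meant to provide.
h2oConf <- H2OConf()
hc <- H2OContext.getOrCreate(h2oConf)
```

If H2OConf() still raises the ClassNotFoundException, the jar is most likely not on the cluster classpath yet; restarting the cluster after attaching the library may be needed.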

DinukaH2O commented 1 year ago

JIRA Issue Migration Info

Jira Issue: SW-2531
Assignee: UNASSIGNED
Reporter: pech
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A

hasithjp commented 1 year ago

JIRA Issue Migration Info Cont'd

Jira Issue Created Date: 2021-02-15T21:18:19.901-0800