rstudio / sparkxgb

R interface for XGBoost on Spark
https://spark.posit.co/packages/sparkxgb/

xgboost Java class not found #12

Closed The-Dub closed 5 years ago

The-Dub commented 5 years ago

Hi,

When running the basic iris example (on the main page), or any other code, locally or on Databricks, I get the following error: Error: java.lang.ClassNotFoundException: ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

I've also tried to run a regression and hit the same issue: Error: java.lang.ClassNotFoundException: ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor

But in the java/main.scala file, it seems that only the classifier is imported, not the regressor.

Any ideas? Thank you.

Session info

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] xgboost_0.81.0.1     sparkxgb_0.0.9001    sparklyr_0.9.9013    DBI_1.0.0           
[5] lubridate_1.7.4      dplyr_0.7.8          RevoUtils_11.0.1     RevoUtilsMath_11.0.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0        compiler_3.5.1    pillar_1.3.1      later_0.7.5       dbplyr_1.3.0     
 [6] bindr_0.1.1       r2d3_0.2.3        base64enc_0.1-3   tools_3.5.1       digest_0.6.18    
[11] lattice_0.20-35   jsonlite_1.5      tibble_2.0.1      pkgconfig_2.0.2   rlang_0.3.1      
[16] Matrix_1.2-14     shiny_1.2.0       rstudioapi_0.9.0  parallel_3.5.1    yaml_2.2.0       
[21] bindrcpp_0.2.2    withr_2.1.2       stringr_1.3.1     httr_1.4.0        askpass_1.1      
[26] rappdirs_0.3.1    generics_0.0.2    htmlwidgets_1.3   grid_3.5.1        rprojroot_1.3-2  
[31] tidyselect_0.2.5  data.table_1.12.0 glue_1.3.0        forge_0.1.9005    R6_2.3.0         
[36] purrr_0.3.0       magrittr_1.5      backports_1.1.3   promises_1.0.1    fortunes_1.5-4   
[41] htmltools_0.3.6   ellipsis_0.1.0    assertthat_0.2.0  mime_0.6          xtable_1.8-3     
[46] httpuv_1.4.5.1    config_0.3        stringi_1.2.4     openssl_1.2.1     crayon_1.3.4 

kevinykuo commented 5 years ago

Databricks currently doesn't support extensions, but @falaki is looking into it.

The-Dub commented 5 years ago

Thank you for the quick reply

This also doesn’t work on my local machine. Any ideas why?

kevinykuo commented 5 years ago

With the same error?

The-Dub commented 5 years ago

Yes, same error. The session info above is from my desktop.

kevinykuo commented 5 years ago

Could you try connecting with sc <- sparklyr::spark_connect(master = "local", config = list(sparklyr.log.console = TRUE)) to see if the xgboost4j-spark dependency is getting added?
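
For anyone landing here with the same ClassNotFoundException, here is a minimal sketch of that check. It assumes the cause is a Spark connection that was opened before sparkxgb was loaded, so the extension's Spark dependencies were never attached. That matches the follow-up in this thread but may not be the only cause. Only the spark_connect() call comes from the comment above; loading the packages first and calling spark_disconnect_all() are assumed steps.

```r
# Load the extension BEFORE connecting, so sparklyr can pick up
# the dependencies that sparkxgb registers when it is loaded.
library(sparklyr)
library(sparkxgb)

# Assumption: a stale connection may still be open from before
# sparkxgb was loaded, so close any existing connections first.
spark_disconnect_all()

# Reconnect with console logging enabled (call taken from the comment above).
sc <- spark_connect(
  master = "local",
  config = list(sparklyr.log.console = TRUE)
)
```

With console logging on, the startup output should show whether an xgboost4j-spark dependency is being resolved. If it is missing, the Java classes will not be on the classpath and the ClassNotFoundException is expected.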

The-Dub commented 5 years ago

It seems that it's "working" now, possibly because I hadn't refreshed the Spark connection... oops.

On the other hand, I'm now getting the same error as https://github.com/rstudio/sparkxgb/issues/4: Error: ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed

kevinykuo commented 5 years ago

OK, thanks for the report. I think this is happening more consistently on Windows. Will investigate...

Closing to track in #4.