Open WillSmisi opened 3 years ago
Do you get an error with v0.9 of xgboost4j / xgboost4j-spark?
I guess so. I succeeded in training the XGBoost model and uploading it, but failed to load the model.
Thanks for your reply. Have you tried saving the model and then loading it?
You can save the model as below:
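from pyspark.ml import Pipeline
# stages: the preprocessing stages; xgb: the XGBoost estimator
pipe = Pipeline(stages=stages + [xgb])
model = pipe.fit(data)
# overwrite() replaces any model already saved at modelpath
model.write().overwrite().save(modelpath)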
and load it later as:
from pyspark.ml import PipelineModel
model = PipelineModel.load(modelpath)
This worked for me.
You can also directly save and load XGBoostClassifier or XGBRegressor, since they have JavaWriter as the parent class.
One point to note: if you are training on a distributed system, you will have to save the model to a distributed storage system such as HDFS or Amazon S3, so that every node can reach the path when the model is written and later loaded.
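To make the storage point concrete, here is a minimal sketch of a guard you could run before saving. The helper names `is_distributed_path` and `check_model_path` and the scheme list are my own assumptions, not part of PySpark or xgboost4j-spark. The check is deliberately conservative: a path with no scheme is resolved by Spark against the cluster's default filesystem, which this sketch does not inspect, so it simply asks for an explicit scheme.

```python
from urllib.parse import urlparse

# Assumption: these URI schemes denote storage that every Spark node can reach.
# Adjust the set for your own cluster (for example, add "gs" or "abfss").
DISTRIBUTED_SCHEMES = {"hdfs", "s3", "s3a", "s3n"}


def is_distributed_path(model_path: str) -> bool:
    """Return True if model_path carries an explicit distributed-storage scheme."""
    return urlparse(model_path).scheme.lower() in DISTRIBUTED_SCHEMES


def check_model_path(model_path: str) -> str:
    """Return model_path unchanged, or raise if it does not name distributed storage."""
    if not is_distributed_path(model_path):
        raise ValueError(
            f"{model_path!r} has no distributed-storage scheme; "
            "use something like hdfs://... or s3a://... when training on a cluster"
        )
    return model_path
```

With this in place, the save call from above would become `model.write().overwrite().save(check_model_path(modelpath))`, and the same checked path can be passed to `PipelineModel.load`.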
Background
I have a small PySpark program that uses xgboost4j and xgboost4j-spark to train a model on a dataset held in a Spark DataFrame.
Training and saving both succeed, but it seems I cannot load the model.
Current library versions:
PySpark 2.4.5
xgboost4j 0.91
xgboost4j-spark 0.91
The main process is as follows:
The error I get:
I have been searching online for a long time without success. Please help, or suggest some ideas for how to achieve this.
Thanks in advance.