Open sibyl1956 opened 2 years ago
Hey @sibyl1956 :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.
@svotaw -- could you take a look at this issue ? Thanks !
Can you give more context here? How did you save the model? What was the code to create the original Pipeline?
Having the same issue. Here's the code I used to train and save the model:
from synapse.ml.lightgbm import LightGBMRegressor
from synapse.ml.train import TrainRegressor
from pyspark.ml.pipeline import PipelineModel

model = TrainRegressor(
    model=LightGBMRegressor(**model_params),
    inputCols=features,
    labelCol=target
)
trained_model = model.fit(df_train)
trained_model.getModel().save('trained_model_pipeline')
loaded_model = PipelineModel.load('trained_model_pipeline')
Running that last line gives me the same error as the OP. Running on SynapseML 0.11.1, PySpark 3.2.3.
I can save the TrainedRegressorModel and load it correctly with TrainedRegressorModel.load, but PipelineModel.load seems like a more general solution for loading models and I would prefer using that.
Here is an anecdotal experience, for whatever it is worth:
I had the same problem and was able to get the pipeline to load by flattening the pipeline stages. It was erroring when my first stage in the pipeline was itself a pipeline of feature transformations. When I removed this nested pipeline structure I was able to load the saved pipeline.
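The flattening workaround above can be sketched generically. Note that `flatten_stages` is a hypothetical helper, not a SynapseML or PySpark API, and the toy objects below only mimic how a nested pipeline carries a list of stages (the real pyspark.ml.Pipeline exposes its stages via getStages(), so an adapted version would call that instead):

```python
from types import SimpleNamespace

def flatten_stages(stages):
    """Recursively expand any stage that itself carries a .stages list."""
    flat = []
    for stage in stages:
        inner = getattr(stage, "stages", None)
        if inner is not None:
            flat.extend(flatten_stages(inner))
        else:
            flat.append(stage)
    return flat

# toy stand-ins for pipeline stages: a nested feature pipeline followed by a model
inner_pipe = SimpleNamespace(stages=["scaler", "encoder"])
print(flatten_stages([inner_pipe, "lightgbm"]))  # ['scaler', 'encoder', 'lightgbm']
```

Rebuilding the Pipeline from the flattened stage list before fitting gives a saved model that loads without the nested-pipeline failure described above.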
For a pyspark.ml.Pipeline where all stages were Java stages (estimators and transformers from the Spark MLlib library), the model could be saved and loaded without problems.
WORKS:
pipe = Pipeline(
    stages=[
        SomePysparkMLibTransformer,  # is an instance of the JavaMLWritable
        LightGBMClassifier(**model_params),
    ]
)
The error occurred when one of the transformers was a custom stage and not a Java stage.
DOESN'T WORK:
pipe = Pipeline(
    stages=[
        SomeCustomTransformer,  # is NOT an instance of the JavaMLWritable
        LightGBMClassifier(**model_params),
    ]
)
In this case the PipelineModel.write method returned a non-Java writer. The classes synapse.ml.lightgbm.LightGBMClassifier and synapse.ml.lightgbm.LightGBMRegressor inherit the correct Java reader (pyspark.ml.util.JavaMLReadable) and writer (pyspark.ml.util.JavaMLWritable). The problem is with the superclass synapse.ml.core.schema.Utils.ComplexParamsMixin, which inherits only from pyspark.ml.util.MLReadable.
I could bypass the problem by wrapping the estimator in a pyspark.ml.Pipeline. In this situation the write method of the last stage returns the JavaMLWriter, not the PipelineModelWriter.
pipe = Pipeline(
    stages=[
        SomeCustomTransformer,  # is NOT an instance of the JavaMLWritable
        Pipeline(
            stages=[
                LightGBMClassifier(**model_params),
            ]
        )
    ]
)
Is this bug still being considered? Implementing

pipeline = Pipeline(
    stages=[
        custom_transformer,
        PipelineModel(stages=[lgbm_model]),
        custom_transformer
    ]
)

seems like it should only be a temporary workaround.
SynapseML version
0.10.1
System information
Language version: Python 3.8.10, Scala 2.12
Spark version: Apache Spark 3.2.1
Spark platform: Databricks
Describe the problem
When I try to load a pipeline model for LightGBM, I encounter this error message: 'com.microsoft.azure.synapse.ml.lightgbm' has no attribute 'LightGBMClassificationModel'
But I had run from synapse.ml.lightgbm import LightGBMClassificationModel before trying to load the pipeline model.
Code to reproduce issue
Other info / logs
What component(s) does this bug affect?
area/cognitive: Cognitive project
area/core: Core project
area/deep-learning: DeepLearning project
area/lightgbm: Lightgbm project
area/opencv: Opencv project
area/vw: VW project
area/website: Website
area/build: Project build system
area/notebooks: Samples under notebooks folder
area/docker: Docker usage
area/models: models related issue

What language(s) does this bug affect?
language/scala: Scala source code
language/python: Pyspark APIs
language/r: R APIs
language/csharp: .NET APIs
language/new: Proposals for new client languages

What integration(s) does this bug affect?
integrations/synapse: Azure Synapse integrations
integrations/azureml: Azure ML integrations
integrations/databricks: Databricks integrations