snowflakedb / snowflake-ml-python

Apache License 2.0

Unable to log ml model using a signature from MLFlow #108

Closed lennartvandeguchte closed 1 month ago

lennartvandeguchte commented 2 months ago

I'm unable to log my sklearn model in Snowflake using a signature instead of sample input data. The model was first registered in MLflow, and now I want to deploy it in Snowflake. Here is the code I wrote:

model_name = 'chiller_performance'
model_alias = 'production'
model = mlflow.sklearn.load_model(f"models:/{model_name}@{model_alias}")
model_info = mlflow.models.get_model_info(f"models:/{model_name}@{model_alias}")

mv = reg.log_model(model, 
                   model_name="chiller_performance",
                   version_name="v1",
                   conda_dependencies=["scikit-learn==1.3.0"],
                   comment="Chiller Performance Model",
                   signatures=model_info.signature,
                   options={'relax_version': False})

The error that I receive:

AttributeError: (0000) 'ModelSignature' object has no attribute 'keys'

I have also tried constructing the signature manually, as described in the documentation (https://docs.snowflake.com/en/developer-guide/snowpark-ml/model-registry/model-signature), but this leads to the same error.

The ModelSignature object is not a dictionary and therefore has no 'keys' attribute, which causes the error above. I therefore also tried converting it to a dict first, using the following code:

mv = reg.log_model(model, 
                   model_name="chiller_performance",
                   version_name="v1",
                   conda_dependencies=["scikit-learn==1.3.0"],
                   comment="Chiller Performance Model",
                   signatures=ModelSignature.to_dict(model_signature),
                   options={'relax_version': False})

Where the model_signature is created as follows:

from snowflake.ml.model.model_signature import FeatureSpec, ModelSignature
def convert_mlflow_schema_to_snowpark_feature_spec(mlflow_schema):
    snowpark_feature_spec = []
    for col in mlflow_schema:
        feature_spec = FeatureSpec.from_mlflow_spec(col, col.name)
        snowpark_feature_spec.append(feature_spec)
    return snowpark_feature_spec

# Convert input and output schemas using FeatureSpec.from_mlflow_spec
input_feature_spec = convert_mlflow_schema_to_snowpark_feature_spec(model_info.signature.inputs)
output_feature_spec = convert_mlflow_schema_to_snowpark_feature_spec(model_info.signature.outputs)

# Create Snowpark ModelSignature
model_signature = ModelSignature(
    inputs=input_feature_spec,
    outputs=output_feature_spec
)
model_signature

This leads to the following error:

ValueError: (0000) Target method inputs is not callable or does not exist in the model.

Can anyone help me here? Or is this a known bug?

sfc-gh-wzhao commented 1 month ago

Hi lennartvandeguchte,

Please check our documentation about log_model (https://docs.snowflake.com/en/developer-guide/snowpark-ml/model-registry/overview#registering-models-and-versions). The signatures argument of log_model is a mapping from target method name to the signature of that method's inputs and outputs. Thus, it should be signatures={"predict": model_info.signature}. Furthermore, when logging an MLflow model, you don't need to provide a signature or sample input data; signatures will be inferred automatically from the MLflow model.
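
To illustrate why both attempts above fail, here is a toy sketch in plain Python. It is not the snowflake-ml-python validation code; `ToySignature`, `ToyModel`, and `check_signatures` are hypothetical stand-ins that only mimic the idea that each key of `signatures` is treated as a target method name.

```python
class ToySignature:
    """Hypothetical stand-in for a signature object (not the snowflake-ml class)."""

    def __init__(self, inputs, outputs):
        self.inputs = inputs
        self.outputs = outputs

    def to_dict(self):
        # Serializes the signature itself; the keys are field names, not method names.
        return {"inputs": self.inputs, "outputs": self.outputs}


class ToyModel:
    """Hypothetical model exposing a single target method, predict."""

    def predict(self, rows):
        return [0.0 for _ in rows]


def check_signatures(model, signatures):
    """Toy validation: every key must name a callable method on the model."""
    for method_name in signatures.keys():  # AttributeError if not a mapping
        if not callable(getattr(model, method_name, None)):
            raise ValueError(
                f"Target method {method_name} is not callable "
                "or does not exist in the model."
            )
    return sorted(signatures)


def outcome(model, signatures):
    """Return the validated method names, or the error as a string."""
    try:
        return check_signatures(model, signatures)
    except (AttributeError, ValueError) as exc:
        return f"{type(exc).__name__}: {exc}"


sig = ToySignature(inputs=["temp_in"], outputs=["cop"])
model = ToyModel()

bare = outcome(model, sig)                 # signature passed directly
as_dict = outcome(model, sig.to_dict())    # keys are "inputs" and "outputs"
keyed = outcome(model, {"predict": sig})   # keyed by target method name

print(bare)
print(as_dict)
print(keyed)
```

Passing the bare object fails because it has no `keys`, and passing `to_dict()` fails because `inputs` is read as a method name. Only the mapping keyed by `predict` passes the toy check.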

lennartvandeguchte commented 1 month ago

Hi sfc-gh-wzhao,

Thank you for your help. The format was not entirely clear to me from the docs, but I now have it working. I was actually using an sklearn model, so I first had to convert the MLflow signature to a Snowflake model signature.

For anyone interested, I created the following functions to do this:


from snowflake.ml.model.model_signature import FeatureSpec, ModelSignature
def create_snowflake_feature_spec(mlflow_schema):
    snowpark_feature_spec = []
    for col in mlflow_schema:
        feature_spec = FeatureSpec.from_mlflow_spec(col, col.name)
        snowpark_feature_spec.append(feature_spec)
    return snowpark_feature_spec

def convert_mlflow_signature_to_snowpark_signature(model_info):
    # Convert input and output schemas using FeatureSpec.from_mlflow_spec
    input_feature_spec = create_snowflake_feature_spec(model_info.signature.inputs)
    output_feature_spec = create_snowflake_feature_spec(model_info.signature.outputs)

    # Create Snowpark ModelSignature
    model_signature = ModelSignature(
        inputs=input_feature_spec,
        outputs=output_feature_spec)

    return model_signature
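
For completeness, the converted signature still has to be keyed by the target method name when it is passed to log_model. The following is a minimal sketch of that last step. `log_with_converted_signature` is a hypothetical helper name, not part of snowflake-ml-python; `reg`, `model`, and `model_info` are assumed to be the objects from the snippets earlier in this thread, and `convert` is assumed to be a function such as convert_mlflow_signature_to_snowpark_signature.

```python
def log_with_converted_signature(
    reg, model, model_info, convert, target_method="predict", **log_kwargs
):
    """Convert the MLflow signature, key it by target method, and log the model.

    Hypothetical helper: `reg` is any object with a log_model method that
    accepts a `signatures` keyword, and `convert` maps model_info to a
    Snowflake-style signature object.
    """
    sf_signature = convert(model_info)
    # log_model expects a mapping of target method name -> signature.
    signatures = {target_method: sf_signature}
    return reg.log_model(model, signatures=signatures, **log_kwargs)


# Intended usage with the objects from this thread (not executed here):
# mv = log_with_converted_signature(
#     reg,
#     model,
#     model_info,
#     convert_mlflow_signature_to_snowpark_signature,
#     model_name="chiller_performance",
#     version_name="v1",
#     conda_dependencies=["scikit-learn==1.3.0"],
#     comment="Chiller Performance Model",
#     options={"relax_version": False},
# )
```

Keeping the target method as a parameter makes it straightforward to register a different method (assuming the model exposes it) without changing the conversion code.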