snowflakedb / snowflake-ml-python

Apache License 2.0

Out of storage error when logging custom model to model registry #106

Open Mugl3 opened 3 months ago

Mugl3 commented 3 months ago

I am following the guide to load a finetuned embeddings model into the registry using Sentence-Transformers.

I have successfully vectorized the code such that the model can be registered if needed. When I get to the stage of actually uploading the local files to the Snowflake account I receive an error:

The relevant part of the traceback (from .venv/lib/python3.8/site-packages/snowflake/ml/_internal/telemetry.py):

        raise me.original_exception from None
    389     else:
--> 390         raise me.original_exception from e
    391 else:
    392     return update_stmt_params_if_snowpark_df(res, statement_params)

SnowparkSQLException: (1300) (1304): 100357 (P0000): Python Interpreter Error:
OSError: [Errno 28] No space left on device in function CreateModule-618f28ef-5e60-49b2-8dfe-0e65d2f5c680 with handler predict.infer

The machine I am operating on has over 800 GB of free disk space, and the model is only 265 MB.

Below is the SQL command shown in the Snowflake UI as having failed:

CREATE MODEL MRCM_HOL_DB.MRCM_HOL_SCHEMA.MY_STCUSTOM_MODEL WITH VERSION VERSION_1 FROM @MRCM_HOL_DB.MRCM_HOL_SCHEMA.SNOWPARK_TEMP_STAGE_IZQPCZQXTY/model

Query ID: 01b56116-3202-ab08-0000-0001d77282d9

This was running in a Python 3.8 venv against a Snowflake trial account for testing. Please let me know if you need more logs.

sfc-gh-wzhao commented 3 months ago

Hi Mugl3,

Thank you for reporting. The error message indicates that the Snowflake virtual warehouse is running out of disk space while loading the model. You could switch to a Snowpark-optimized warehouse, which has a larger disk, to load your model. We are also working on making it possible to load larger models in a standard warehouse in the next few releases.
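For reference, a Snowpark-optimized warehouse can be created with SQL along these lines (the warehouse name and size here are placeholders, not part of the original thread):

```sql
-- Snowpark-optimized warehouses provide more memory and local disk per node
CREATE OR REPLACE WAREHOUSE snowpark_opt_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  WAREHOUSE_TYPE = 'SNOWPARK-OPTIMIZED';
```

After creating it, point the session at it (e.g. `session.use_warehouse("snowpark_opt_wh")` in Snowpark Python) before logging the model.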

Mugl3 commented 3 months ago

Hi Wzhao,

Are you able to confirm the maximum disk space by warehouse type and size? I have tried an XL Snowpark-optimized warehouse and got the same error as before. It's only a small model that easily fits in RAM.

Thanks

sfc-gh-wzhao commented 3 months ago

Hi Mugl3,

Typically a standard warehouse has several hundred MB of temporary storage, while a Snowpark-optimized warehouse has multiple GB. The disk space issue occurs because the model is first written to the temporary storage and then loaded into the warehouse's RAM. As mentioned above, we have noticed this issue and are currently working on loading the model directly, without going through the temporary storage.
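Since the limit applies to the model's on-disk footprint, it can help to measure the saved model directory locally before choosing a warehouse type. A minimal sketch (the directory name matches the one used later in this thread; the helper itself is illustrative):

```python
import os

def dir_size_mb(path: str) -> float:
    """Total size of all files under `path`, in MB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total / (1024 * 1024)

# e.g. print(f"{dir_size_mb('msmarco_distilbertv4/'):.1f} MB")
```

If the result is in the hundreds of MB, a standard warehouse's temporary storage may already be too small.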

For your report on "I have tried an XL Snowpark WH with the same error as before. It's only a small model that easily fits in RAM.", could you please share the query ID, so that we could further investigate. Thank you!

Mugl3 commented 3 months ago

When I went to get the query ID, I found that the sample code had replaced the Snowpark-optimized warehouse with a standard one. I have updated the code and it no longer fails with that error.

I am trying to deploy a base sentence-transformers embedding model. I have included my code to ease troubleshooting:

#Default SF connection details defined in ~/.config/snowflake/connection.toml
from transformers import AutoTokenizer, AutoModel
import torch
from snowflake.snowpark import Session
from snowflake.snowpark.version import VERSION
from snowflake.ml.registry import Registry
from snowflake.ml.model import custom_model
from snowflake.ml.model import model_signature

import pandas as pd
import json
import os
import shutil

# warning suppresion
import warnings; warnings.simplefilter('ignore')

#Not attaching the connection code for privacy reasons

#Load a SF model
from sentence_transformers import SentenceTransformer
modelPath = os.getcwd() + '/msmarco_distilbertv4/'

model = SentenceTransformer('msmarco-distilbert-base-v4')
model.save(modelPath)
model = SentenceTransformer(modelPath)

Now define the model class

#Generate custom model class
class ST_Custom_Model(custom_model.CustomModel):
    # The init function is used to load the model file
    def __init__(self, context: custom_model.ModelContext) -> None:
        super().__init__(context)
        self.model = SentenceTransformer(modelPath)
        self.model.memory='/tmp/'

    @custom_model.inference_api
    def predict(self, sentences_df: pd.DataFrame) -> pd.DataFrame:
        data = {"id": [], "embeddings": []}
        counter = 0
        for _, row in sentences_df.iterrows():
            data["id"].append(counter)
            # Use the model loaded in __init__, not the module-level global
            data["embeddings"].append(self.model.encode(row["string_column"])[0])
            counter += 1
        res_df = pd.DataFrame.from_dict(data=data, orient="columns")
        try:
            res_df.set_index("id", inplace=True)
        except KeyError:
            print("Could not set index as id")
        return res_df

Create custom model context

stcustom_mc = custom_model.ModelContext(
    models={ # This should be for models/objects that are supported by Model Registry OOTB.
    },
    artifacts={ # Everything not supported needs to be here
        'model_file': 'msmarco_distilbertv4/vocab.txt',
        'model_file1': 'msmarco_distilbertv4/config_sentence_transformers.json',
        'model_file2': 'msmarco_distilbertv4/README.md',
        'model_file3': 'msmarco_distilbertv4/tokenizer.json',
        'model_file4': 'msmarco_distilbertv4/1_Pooling',
        'model_file5': 'msmarco_distilbertv4/config.json',
        'model_file6': 'msmarco_distilbertv4/modules.json',
        'model_file7': 'msmarco_distilbertv4/sentence_bert_config.json',
        'model_file8': 'msmarco_distilbertv4/tokenizer_config.json',
        'model_file9': 'msmarco_distilbertv4/model.safetensors',
        'model_file10': 'msmarco_distilbertv4/special_tokens_map.json'
    }
)
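As an aside, hand-listing every file in the artifacts dict is error-prone; a sketch of building it from the directory contents instead (the helper and its key-naming scheme are illustrative, mirroring the hand-written `model_file`, `model_file1`, ... keys above):

```python
import os

def build_artifacts(model_dir: str) -> dict:
    """Map each entry in model_dir to an artifact key."""
    artifacts = {}
    for idx, name in enumerate(sorted(os.listdir(model_dir))):
        key = "model_file" if idx == 0 else f"model_file{idx}"
        artifacts[key] = os.path.join(model_dir, name)
    return artifacts

# e.g. custom_model.ModelContext(models={}, artifacts=build_artifacts('msmarco_distilbertv4'))
```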

Test the model loads & outputs

#Generate dataframe with id and string column
data_dict = {
    "id": [1, 2, 3],
    "string_column": [["Hello",], ["World",], ["Python",]]
}

df = pd.DataFrame(data_dict)

print(df)
my_stcustom_model = ST_Custom_Model(stcustom_mc)
output_pd = my_stcustom_model.predict(df)
output_pd
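For a quick local sanity check of the reshaping logic without Snowflake or a real model, the same loop can be exercised with a stub encoder (the stub is purely illustrative and stands in for `SentenceTransformer.encode`):

```python
import pandas as pd

def stub_encode(sentences):
    """Stand-in for SentenceTransformer.encode: one fixed-length
    vector per input sentence."""
    return [[float(len(s))] * 3 for s in sentences]

df = pd.DataFrame({
    "id": [1, 2, 3],
    "string_column": [["Hello"], ["World"], ["Python"]],
})

data = {"id": [], "embeddings": []}
for counter, (_, row) in enumerate(df.iterrows()):
    data["id"].append(counter)
    data["embeddings"].append(stub_encode(row["string_column"])[0])

res_df = pd.DataFrame.from_dict(data, orient="columns").set_index("id")
print(res_df.shape)  # one embedding row per input row
```

This confirms the output is a DataFrame indexed by a freshly generated id, with one embedding per input row, independent of the model weights.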

Infer signature

list_sentence = ["Hello World", "Python is cool", "Snowflake is awesome"]
predict_sign = model_signature.infer_signature(input_data=list_sentence, output_data=output_pd)
predict_sign

Log the model

# Create a model registry connection using the Snowpark session object;
# the current database and schema are used for storing the model.
snowml_registry = Registry(session)

custom_mv = snowml_registry.log_model(
    my_stcustom_model,
    model_name="my_stcustom_model",
    version_name="version_1",
    conda_dependencies=["sentence-transformers"],
    pip_requirements=None,
    options={"relax_version": False},
    signatures={"predict": predict_sign},
    comment='Sentence-Transformers embedding model using the CustomModel API'
)

When logging the model I receive this error:

SnowparkSQLException: (1300) (1304): 100357 (P0000): Python Interpreter Error: ModuleNotFoundError: No module named 'sentence_transformers.model_card' in function CreateModule-84e75585-09b0-45aa-871d-c5d32fed8f62 with handler predict.infer

Query ID Query - 01b56924-3202-ab2e-0001-d77200019056

Is it an issue with how I define the conda dependencies in the last code block? I also tried conda_dependencies=["sentence-transformers", "numpy", "pandas"].

sfc-gh-wzhao commented 2 months ago

Hi Mugl3,

Sorry for the late reply.

I believe the issue is that the sentence-transformers version in your local environment is not aligned with the one in the Snowflake environment. You can pin the version when specifying conda dependencies, e.g. conda_dependencies=["sentence-transformers==2.2.2"].
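To find the exact version to pin, one option is to read it from the local environment with the standard library (the helper name is illustrative):

```python
from importlib.metadata import version

def pinned_requirement(dist_name: str) -> str:
    """Return a 'name==x.y.z' pin matching the locally installed version."""
    return f"{dist_name}=={version(dist_name)}"

# e.g. conda_dependencies=[pinned_requirement("sentence-transformers")]
```

This keeps the conda dependency in lockstep with whatever was used to save and test the model locally.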

Also, our library recently added out-of-the-box support for sentence-transformers models, so you could try logging the model directly without using a custom model.