googleapis / python-aiplatform

A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.

Allow save/load and log_model for different frameworks/flavors, such as PyTorch, Spacy, custom solution models #2033

Open krstp opened 1 year ago

krstp commented 1 year ago

Currently Vertex AI allows model save/load only for a particular set of frameworks, such as:

Union["sklearn.base.BaseEstimator", "xgb.Booster", "tf.Module"]

which constrains usage of the Model Registry quite a bit. The same applies to the aiplatform.log_model() method.
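
For context, a minimal sketch of what works today (project, bucket, experiment, and run names below are hypothetical): a scikit-learn estimator is accepted, while anything outside the Union above fails the type check.

from google.cloud import aiplatform
from sklearn.linear_model import LogisticRegression

# Hypothetical names, for illustration only.
aiplatform.init(project="my-project", location="us-central1", experiment="my-experiment")
aiplatform.start_run("my-run")

# Accepted: LogisticRegression is a sklearn.base.BaseEstimator.
aiplatform.log_model(
    model=LogisticRegression(),
    uri="gs://my-bucket/models/sklearn-model/",
    display_name="sklearn-model",
)
# A torch.nn.Module or spaCy pipeline passed as `model` here is rejected,
# since it does not match any of the supported framework types.

aiplatform.end_run()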

I see the constraint coming from the AutoML platform; however, it would benefit the community not to restrict Model Tracking and Registry purely to AutoML-supported frameworks. As such, I would like to extend my ask to allow registration of other model frameworks, such as PyTorch, spaCy, or custom models that do not necessarily fit any of the popular frameworks.

Extending support to other frameworks would allow more flexibility around Vertex AI Tracking and Registry, rather than limiting them purely to AutoML-based tracking and registry.

Current workaround: one can pass a dummy model, such as TfidfVectorizer(), in place of the real model, which then becomes:

aiplatform.log_model(
    model=dummy_model,
    uri=URI,
    display_name=DISPLAY_NAME,
)

However, this is pretty rough for future registration or proper tracking; one needs to be aware of what was actually saved. In my case the key is the URI: as a model artifact the dummy won't make much sense, but it does allow for a log_model record and future retrieval of the real model from the URI.
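
For completeness, a hedged sketch of the retrieval side of this workaround (the artifact ID and bucket below are hypothetical): the logged record only matters for its URI, which points at wherever the real model was actually saved, and the real model is then loaded with its own framework's tooling.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Hypothetical artifact ID recorded when the dummy model was logged.
artifact = aiplatform.Artifact(artifact_name="my-dummy-model-artifact")
real_model_uri = artifact.uri  # e.g. gs://my-bucket/models/real-model/
# Load the real model from real_model_uri with the appropriate framework's own loader.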

jasonbrancazio commented 6 months ago

I agree that this would be a useful feature. Furthermore, the Model Registry could have separate fields for the model artifact and the serving container, and could allow serving_container = None.
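
To illustrate that suggestion (names below are hypothetical, and this is only a sketch of current behavior): uploading a model version to the Model Registry today effectively couples the artifact with a serving container image, roughly as follows; the proposal is to let the artifact be registered on its own.

from google.cloud import aiplatform

# Today (hypothetical names): a registered model couples the artifact
# with a serving container image.
model = aiplatform.Model.upload(
    display_name="my-model",
    artifact_uri="gs://my-bucket/models/my-model/",
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest",
)
# The suggestion above would instead allow registering the artifact alone,
# e.g. with something like serving_container=None.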

jasonbrancazio commented 6 months ago

The following code should at least let you log an arbitrary model as an Artifact in an ExperimentRun.

Logging to an experiment is sometimes more useful to me than versioning a model in the Model Registry, because experiment artifacts are in the same tab set as the metrics that I log in the Experiment Run UI.

You should just ensure that your experiment run names (experiment_version here) are unique, so that the model is not associated with the wrong run.

This example is for a PyTorch state dict previously saved to Cloud Storage.

from time import time

from google.cloud.aiplatform import Artifact
from google.cloud.aiplatform.metadata.context import Context
import torch

def log_model_to_experiment(gcs_uri: str, experiment_version: str) -> None:
    # Find the ExperimentRun context whose display name matches experiment_version.
    for c in Context.list():
        if c.schema_title == 'system.ExperimentRun' and c.display_name == experiment_version:
            # Create a system.Model artifact pointing at the previously saved state dict.
            a = Artifact.create(
                'system.Model',
                uri=gcs_uri,
                resource_id=f'{experiment_version}-model-{str(hex(int(time())))[2:]}',  # must be globally unique
                display_name=f'{experiment_version}-model',
                schema_version='0.0.1',
                metadata={"framework": "PyTorch", "framework_version": torch.__version__, "payload_format": "state_dict"},
            )
            # Associate the artifact with the experiment run.
            c.add_artifacts_and_executions(artifact_resource_names=[a.resource_name])
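
A usage sketch, assuming the state dict was already uploaded to a (hypothetical) Cloud Storage path and the experiment run named by experiment_version already exists:

from google.cloud import aiplatform

# Hypothetical project/experiment names; the run "v42" must already exist.
aiplatform.init(project="my-project", location="us-central1", experiment="my-experiment")

log_model_to_experiment(
    gcs_uri="gs://my-bucket/checkpoints/v42/model_state_dict.pt",
    experiment_version="v42",
)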