oracle / accelerated-data-science

ADS is the Oracle Data Science Cloud Service's Python SDK, supporting model ops (train/eval/deploy) along with running workloads on Jobs and Pipeline resources.
https://accelerated-data-science.readthedocs.io/
Universal Permissive License v1.0

[Bug]: Validation error in GenericModel.save when the python version is >= 3.10 #760

Closed mayoor closed 5 months ago

mayoor commented 6 months ago

Oracle-ads version used

Description

When a user tries to save a model whose inference conda environment uses Python 3.10 or greater, introspection fails with the error: "In runtime.yaml, the key MODEL_DEPLOYMENT.INFERENCE_PYTHON_VERSION must be set to a value of 3.6 or higher." The error is misleading, because 3.10 already satisfies the stated requirement.

How to Reproduce

import pandas as pd
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

import ads
import automlx as automl
from automlx import init
from ads.model import GenericModel
from ads.common.model_metadata import UseCaseType

dataset = fetch_openml(name='adult', as_frame=True)
df, y = dataset.data, dataset.target

# Several of the columns are incorrectly labeled as category type in the original dataset
numeric_columns = ['age', 'capitalgain', 'capitalloss', 'hoursperweek']
for col in df.columns:
    if col in numeric_columns:
        df[col] = df[col].astype(int)

X_train, X_test, y_train, y_test = train_test_split(df,
                                                    y.map({'>50K': 1, '<=50K': 0}).astype(int),
                                                    train_size=0.7,
                                                    random_state=0)

init(engine='local')
est = automl.Pipeline(task='classification')
est.fit(X_train, y_train)

ads.set_auth("resource_principal")
automl_model = GenericModel(estimator=est, artifact_dir="automl_model_artifact")

automl_model.prepare(inference_conda_env="automlx234_p310_cpu_x86_64_v1",
                     training_conda_env="automlx234_p310_cpu_x86_64_v1",
                     use_case_type=UseCaseType.BINARY_CLASSIFICATION,
                     X_sample=X_test,
                     force_overwrite=True)

automl_model.introspect()

What was Observed

[image: screenshot of the introspection validation error]

What was Expected

No validation error

Version

'2.9.0'

mayoor commented 6 months ago

Place to fix - https://github.com/oracle/accelerated-data-science/blob/main/ads/model/model_artifact_boilerplate/artifact_introspection_test/model_artifact_validate.py#L32
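
For readers who want to see how a check like this can reject 3.10, here is a minimal sketch. It is an assumption about the failure mode, not a copy of the ADS validator: `NARROW_PATTERN`, `is_supported_narrow`, and `is_supported_numeric` are hypothetical names made up for illustration. The first function uses a pattern that only allows a single-digit minor version between 6 and 9, so "3.10" is rejected; the second compares (major, minor) as integers, which is one possible way to accept 3.10 and later.

```python
import re

# Hypothetical pattern (NOT taken from ADS): accepts only 3.6 through 3.9,
# with an optional patch component such as 3.9.7.
NARROW_PATTERN = re.compile(r"^3\.[6-9](\.\d+)?$")


def is_supported_narrow(version: str) -> bool:
    """Return True only if the version string matches the narrow pattern."""
    return NARROW_PATTERN.match(version) is not None


def is_supported_numeric(version: str, minimum=(3, 6)) -> bool:
    """Compare (major, minor) as integers against a minimum version."""
    parts = version.strip().split(".")
    try:
        major_minor = tuple(int(p) for p in parts[:2])
    except ValueError:
        # Non-numeric components cannot be a valid version.
        return False
    if len(major_minor) < 2:
        # A bare major version such as "3" is not specific enough.
        return False
    return major_minor >= minimum


if __name__ == "__main__":
    # Show how the two checks disagree once the minor version has two digits.
    for v in ("3.6", "3.9.7", "3.10", "3.11.4"):
        print(v, is_supported_narrow(v), is_supported_numeric(v))
```

The integer comparison avoids treating the minor version as a single character or as a decimal fraction, which is where two-digit minor versions tend to go wrong. Whether the released fix takes this approach or simply widens the pattern is not stated in this thread.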

mayoor commented 6 months ago

To be closed once the changes are released to pypi