iterative / mlem

🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞
https://mlem.ai
Apache License 2.0

Serializers type #612

Open denis-akvelon opened 1 year ago

denis-akvelon commented 1 year ago

While working with MLEM, we ran into confusion about the serializer types defined in your schema (interface.json). Everything works when the data type is "dataframe", but we have not been able to produce any other type. For example, we used pandas_scikit.py to generate a model with a series type, but the result was still "dataframe". We hit the same problem with a LightGBM model. We would be grateful for a detailed description of how to avoid this, or for any pointers that might help.

Example:

from mlem.api import save
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Load the training data
url = 'http://bit.ly/kaggletrain'
train = pd.read_csv(url)
train.head()

# Selecting with a list of column names gives a pandas DataFrame
feature_cols = ['Pclass', 'Parch']
X = train.loc[:, feature_cols]

y = train.Survived

logreg = LogisticRegression()
logreg.fit(X, y)

# Save the fitted model, passing X as the sample data
save(
    logreg,
    sample_data=X,
    path="series_logreg",
)

aguschin commented 1 year ago

Hi @denis-akvelon! Thanks for reporting! Could you please explain what use case is blocked by this? Thanks!

aleksandr-dudko commented 1 year ago

Hi @aguschin! We don't know how to obtain any data_type other than 'dataframe', for example SeriesType, LightGBMDataType, or TorchTensorDataType. Is that possible? If so, could you please explain how to do it? Thank you! :)
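
One thing that may be worth checking, independent of MLEM itself: in the example above, X is built with train.loc[:, feature_cols], where feature_cols is a list, so X is a pandas DataFrame even though the save path is named "series_logreg". The sketch below uses plain pandas only and a small made-up frame (the values are illustrative, not the real training data) to show which selections give a DataFrame and which give a Series. Whether MLEM derives the recorded data type from the Python type of sample_data is an assumption here; this thread does not confirm it.

```python
import pandas as pd

# Small made-up stand-in for the training data used in the example above.
train = pd.DataFrame(
    {"Pclass": [3, 1, 3, 2], "Parch": [0, 0, 1, 2], "Survived": [0, 1, 1, 0]}
)

# Selecting with a list of column names yields a DataFrame,
# even when the list holds a single name.
X_frame = train.loc[:, ["Pclass", "Parch"]]
X_single = train.loc[:, ["Pclass"]]

# Selecting with a bare column name yields a Series.
X_series = train.loc[:, "Pclass"]

# Collect the type names so they are easy to compare.
kinds = {
    "X_frame": type(X_frame).__name__,
    "X_single": type(X_single).__name__,
    "X_series": type(X_series).__name__,
}
print(kinds)
```

If the type of sample_data does matter, a Series such as X_series would be the object to try passing. Note that scikit-learn estimators expect 2-D input, so a model fitted on a DataFrame may not accept a Series directly.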