Add support for model signature and examples in dataset MlflowModelTrackingDataset
Context
MLflow supports adding a signature and input examples to models, which surfaces useful information in the model artifact view.
Possible Implementation
At the moment I am passing the signature as a dictionary in the save_args and using an additional omegaconf resolver that transforms the dictionary into a ModelSignature object. Something similar could be achieved inside MlflowModelTrackingDataset.
    import json

    from mlflow.models.signature import ModelSignature
    from omegaconf import DictConfig, OmegaConf


    def create_model_signature(model_signature: DictConfig) -> ModelSignature:
        signature_dict = OmegaConf.to_container(model_signature)
        json_signature = {}
        for key, value in signature_dict.items():
            if value is None:
                json_signature[key] = None
            else:
                json_signature[key] = json.dumps(value)
        return ModelSignature.from_dict(json_signature)
Another alternative would be to use MLflow's infer_signature; however, I am not sure how to pass it the object to infer the schema from. The same problem applies to the input example, which according to the documentation can be any of pandas.core.frame.DataFrame, numpy.ndarray, dict, list, csr_matrix, csc_matrix, str, bytes, tuple. At the moment I can only pass a dict or list in the catalog yml.
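To make the dictionary-based approach more concrete, here is a minimal sketch of the serialization step on its own, without MLflow or omegaconf. The helper name to_signature_dict and the column names are hypothetical and only for illustration. The assumption is that each schema in the catalog is written as a list of column specs with "name" and "type" keys, which is the shape MLflow's schema JSON uses.

```python
import json
from typing import Any, Dict, Optional


def to_signature_dict(signature: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Serialize each schema to a JSON string, leaving missing ones as None."""
    return {
        key: None if value is None else json.dumps(value)
        for key, value in signature.items()
    }


# Hypothetical signature as it might appear under save_args in the catalog.
catalog_signature = {
    "inputs": [
        {"name": "sepal_length", "type": "double"},
        {"name": "sepal_width", "type": "double"},
    ],
    "outputs": [{"type": "long"}],
}

signature_dict = to_signature_dict(catalog_signature)
# With MLflow installed, the result could then be passed to
# ModelSignature.from_dict(signature_dict), as in the resolver above.
```

If the dataset did this conversion itself, the catalog entry would only need the plain nested dictionary and no custom resolver.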