Open Wimsen opened 6 months ago
Thanks for reporting, and apologies for the late reply.
This is a valid concern. Ideally we should not require you to call `pyfunc.load_model()`; instead, we should try to get the information directly from `model_uri`. Let us look into this.
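To illustrate what reading the information directly from the model location could look like, here is a minimal sketch that collects the pinned dependencies of a saved model without deserializing it. It assumes the `model_uri` has already been resolved to a local directory and that the directory contains the `requirements.txt` file that recent MLflow versions write next to the `MLmodel` file. The helper name `read_pinned_requirements` is hypothetical and not part of MLflow or `snowflake-ml-python`.

```python
import tempfile
from pathlib import Path


def read_pinned_requirements(model_dir):
    """Return {package: version} for every `name==version` line in requirements.txt.

    Hypothetical helper: it only reads a text file and never imports or
    unpickles the model, so the model's dependencies need not be installed.
    """
    pins = {}
    req_file = Path(model_dir) / "requirements.txt"
    for raw in req_file.read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and requirements without an exact pin
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


# Demo on a throwaway directory standing in for a downloaded model (made-up pins).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "requirements.txt").write_text(
        "mlflow==2.9.2\nscikit-learn==1.1.3  # training-time version\n"
    )
    demo_pins = read_pinned_requirements(tmp)
```

With pins like these in hand, a registration step could record the dependencies as metadata and defer the actual model loading to an environment that satisfies them.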
Registering MLflow models is currently done by referencing an in-memory pyfunc model. Snippet from the documentation:
A problem with this is that `mlflow.pyfunc.load_model()` requires the model's dependencies to be available in the Python runtime that calls `registry.log_model()`. The model's dependencies and the runtime's are likely to diverge and, in the worst case, are incompatible with each other.

An example of the latter is a model trained using `scikit-learn < 1.2.1`. Correct deserialization of the model in the registration runtime is then impossible, as `snowflake-ml-python` itself depends on `scikit-learn (>=1.2.1,<1.4)`. A workaround is to install a newer version of `scikit-learn` and load the model with it, but this is inadvisable, since the model would then be deserialized with a different version than the one it was trained with.

Is it possible to make the registration of MLflow models independent of the registered model's dependencies? Ideally the model registration just uploads the model artifacts to the model registry, and the actual loading and deserialization of the MLflow model is done at inference time using the correct dependencies.
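To make the conflict concrete, here is a small sketch that checks a training-time `scikit-learn` version against the `>=1.2.1,<1.4` range quoted above. The helper names and the example version `1.1.3` are made up for illustration, and the parser only handles plain numeric versions such as `1.2.1`; a real check would more likely rely on a dedicated version-parsing library.

```python
def parse_version(text):
    """Turn '1.2.1' into (1, 2, 1). Only plain numeric versions are supported."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, lower_inclusive, upper_exclusive):
    """Return True if lower_inclusive <= version < upper_exclusive."""
    v = parse_version(version)
    return parse_version(lower_inclusive) <= v < parse_version(upper_exclusive)


# Hypothetical training-time version, chosen to fall below the required range.
trained_with = "1.1.3"
compatible = satisfies(trained_with, "1.2.1", "1.4")
```

Here `compatible` is `False`: no single environment can hold both the version the model was trained with and the one the registration runtime requires, which is why loading the model at registration time is the sticking point.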