SeldonIO / seldon-core

An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
https://www.seldon.io/tech/products/core/

Deleting failed model doesn't work in v2 #4768

Open nadworny opened 1 year ago

nadworny commented 1 year ago

Describe the bug

I have a model that failed deployment. `kubectl delete` reports success, but the model is still listed among the Model custom resources in k8s.

To reproduce

  1. Train a model and log it to MLflow.
  2. Deploy:
    apiVersion: mlops.seldon.io/v1alpha1
    kind: Model
    metadata:
      name: xxx
    spec:
      storageUri: "azureblob://xxx"
      secretName: "xxx-secret"
      requirements:
      - mlflow
  3. Delete: kubectl delete -f ./k8s/model_xxx.yaml -n ${NAMESPACE}
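
To confirm the symptom after step 3, a small check along these lines may help. It is only a sketch: the function name `model_status` is hypothetical, and it assumes the Model CRD is served as `models.mlops.seldon.io`, matching the `apiVersion` and `kind` in the manifest above. A resource that still exists with a `deletionTimestamp` set has had deletion requested but not completed, which in Kubernetes generally means a finalizer is still pending.

```shell
#!/bin/sh
# Report whether a Model resource still exists after `kubectl delete`.
# Usage: model_status <model-name> <namespace>
# Prints one of: not-found, terminating, present
model_status() {
  name="$1"
  namespace="$2"

  # `kubectl get` exits non-zero when the resource does not exist.
  # Caveat: it also exits non-zero on other errors (e.g. unreachable cluster).
  if ! kubectl get models.mlops.seldon.io "$name" -n "$namespace" >/dev/null 2>&1; then
    echo "not-found"
    return 0
  fi

  # A non-empty deletionTimestamp means deletion was requested but the
  # object is still being held, typically by a finalizer.
  ts=$(kubectl get models.mlops.seldon.io "$name" -n "$namespace" \
    -o jsonpath='{.metadata.deletionTimestamp}')

  if [ -n "$ts" ]; then
    echo "terminating"
  else
    echo "present"
  fi
}
```

For the manifest above this would be called as `model_status xxx "${NAMESPACE}"`. If it prints `terminating`, `kubectl get models.mlops.seldon.io xxx -n ${NAMESPACE} -o jsonpath='{.metadata.finalizers}'` shows which finalizers are still attached.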

Expected behaviour

Model is completely deleted.

Environment

ukclivecox commented 1 year ago

Which version of Seldon Core V2 are you running? Models that fail scheduling should still be deletable in the latest version.

nadworny commented 1 year ago

@cliveseldon how can I verify that? I installed it today using the Helm charts.

nadworny commented 1 year ago

Sorry, my bad! Deletion works on AKS (I hadn't switched context in k9s), but it doesn't work on local k8s (Docker Desktop). I'll update the environment info shortly.

ukclivecox commented 1 year ago

Can you make sure you have the same latest images locally as you are running on AKS? Otherwise I'm not sure what the difference would be.
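
One way to compare the two clusters is to list the container images running in the Seldon namespace under each kubeconfig context. The following is a rough sketch, not an official procedure: `list_images` is a hypothetical helper, and the context names and namespace in the usage comment are placeholders that depend on the local setup.

```shell
#!/bin/sh
# Print the sorted, de-duplicated container images of all pods in a
# namespace, for a given kubeconfig context.
# Usage: list_images <context> <namespace>
list_images() {
  kubectl --context "$1" get pods -n "$2" \
    -o jsonpath='{.items[*].spec.containers[*].image}' \
    | tr -s '[:space:]' '\n' \
    | sort -u
}

# Example (context names and namespace are assumptions):
#   list_images my-aks-context seldon-mesh > aks-images.txt
#   list_images docker-desktop seldon-mesh > local-images.txt
#   diff aks-images.txt local-images.txt
```

An empty `diff` would suggest both clusters run the same image tags. Note that identical tags do not guarantee identical image contents if a mutable tag such as `latest` is in use.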

nadworny commented 1 year ago

I did the installation using Helm on Wednesday, so I'm pretty sure it's the latest.

Kolajik commented 9 months ago

This sounds like the same problem as issue https://github.com/SeldonIO/seldon-core/issues/5043.