Open lukemarsden opened 3 years ago
It seems that seldon-core is not correctly configured:
root@aac9e2c0417cad2a:~# kubectl logs -n kf seldon-core-6fdf95f864-nst87 | head -n 20
2021-02-17T08:39:04.674Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":8080"}
2021-02-17T08:39:04.696Z INFO controller-runtime.builder Registering a mutating webhook {"GVK": "machinelearning.seldon.io/v1alpha2, Kind=SeldonDeployment", "path": "/mutate-machinelearning-seldon-io-v1alpha2-seldondeployment"}
2021-02-17T08:39:04.696Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-machinelearning-seldon-io-v1alpha2-seldondeployment"}
2021-02-17T08:39:04.696Z INFO controller-runtime.builder Registering a validating webhook {"GVK": "machinelearning.seldon.io/v1alpha2, Kind=SeldonDeployment", "path": "/validate-machinelearning-seldon-io-v1alpha2-seldondeployment"}
2021-02-17T08:39:04.696Z INFO controller-runtime.webhook registering webhook {"path": "/validate-machinelearning-seldon-io-v1alpha2-seldondeployment"}
2021-02-17T08:39:04.696Z INFO controller-runtime.builder Registering a mutating webhook {"GVK": "machinelearning.seldon.io/v1alpha3, Kind=SeldonDeployment", "path": "/mutate-machinelearning-seldon-io-v1alpha3-seldondeployment"}
2021-02-17T08:39:04.696Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-machinelearning-seldon-io-v1alpha3-seldondeployment"}
2021-02-17T08:39:04.696Z INFO controller-runtime.builder Registering a validating webhook {"GVK": "machinelearning.seldon.io/v1alpha3, Kind=SeldonDeployment", "path": "/validate-machinelearning-seldon-io-v1alpha3-seldondeployment"}
2021-02-17T08:39:04.696Z INFO controller-runtime.webhook registering webhook {"path": "/validate-machinelearning-seldon-io-v1alpha3-seldondeployment"}
2021-02-17T08:39:04.696Z INFO controller-runtime.builder Registering a mutating webhook {"GVK": "machinelearning.seldon.io/v1, Kind=SeldonDeployment", "path": "/mutate-machinelearning-seldon-io-v1-seldondeployment"}
2021-02-17T08:39:04.696Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-machinelearning-seldon-io-v1-seldondeployment"}
2021-02-17T08:39:04.697Z INFO controller-runtime.builder Registering a validating webhook {"GVK": "machinelearning.seldon.io/v1, Kind=SeldonDeployment", "path": "/validate-machinelearning-seldon-io-v1-seldondeployment"}
2021-02-17T08:39:04.697Z INFO controller-runtime.webhook registering webhook {"path": "/validate-machinelearning-seldon-io-v1-seldondeployment"}
2021-02-17T08:39:04.697Z INFO setup starting manager
I0217 08:39:04.697745 1 leaderelection.go:242] attempting to acquire leader lease kf/a33bd623.machinelearning.seldon.io...
2021-02-17T08:39:04.707Z INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
E0217 08:39:04.728161 1 leaderelection.go:331] error retrieving resource lock kf/a33bd623.machinelearning.seldon.io: configmaps "a33bd623.machinelearning.seldon.io" is forbidden: User "system:serviceaccount:kf:seldon-core" cannot get resource "configmaps" in API group "" in the namespace "kf"
2021-02-17T08:39:04.802Z INFO controller-runtime.webhook.webhooks starting webhook server
2021-02-17T08:39:04.803Z INFO controller-runtime.certwatcher Updated current TLS certificate
2021-02-17T08:39:04.804Z INFO controller-runtime.webhook serving webhook server {"host": "", "port": 9876}
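The `leaderelection.go:331` line in the log above is an RBAC denial: the `seldon-core` service account cannot read ConfigMaps in `kf`, so the controller never acquires its leader lease. A minimal sketch for confirming this follows. The helper names `rbac_denials` and `can_i_configmaps` are hypothetical, and the verbs checked (`get`, `create`, `update`) are an assumption about what a ConfigMap-based leader lock needs.

```shell
#!/bin/sh
# Hypothetical helper: keep only RBAC denial lines from logs read on stdin.
rbac_denials() {
  grep -F 'is forbidden: User' || true   # no match is not an error here
}

# Hypothetical helper: ask the API server whether a service account may
# use ConfigMaps in its namespace. Needs kubectl and cluster access, and
# impersonation rights for --as.
can_i_configmaps() {
  ns=$1
  sa=$2
  for verb in get create update; do      # assumed set of verbs
    printf '%s configmaps: ' "$verb"
    kubectl auth can-i "$verb" configmaps -n "$ns" \
      --as="system:serviceaccount:$ns:$sa" || true
  done
}

# Usage (not run here because it needs the cluster from this report):
#   kubectl logs -n kf seldon-core-6fdf95f864-nst87 | rbac_denials
#   can_i_configmaps kf seldon-core
```

If `can_i_configmaps` reports `no`, the Role or RoleBinding for the `seldon-core` service account in `kf` is probably missing or incomplete.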
Following the instructions from here:
kubectl label namespace kf serving.kubeflow.org/inferenceservice=enabled
cat <<EOF | kubectl create -n kf -f -
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: mlflow
spec:
  annotations:
    seldon.io/executor: "true"
  name: wines
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - name: classifier
          livenessProbe:
            initialDelaySeconds: 150
            failureThreshold: 300
            periodSeconds: 10
            successThreshold: 1
            httpGet:
              path: /health
              port: http
              scheme: HTTP
          readinessProbe:
            initialDelaySeconds: 150
            failureThreshold: 300
            periodSeconds: 10
            successThreshold: 1
            httpGet:
              path: /health
              port: http
              scheme: HTTP
    graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: s3://mlflow/0/59bf5cb90345488289b4f4c5f702b560/artifacts/model/
      envSecretRefName: seldon-init-container-secret
      name: ElasticnetWineModel
    name: default
    replicas: 1
EOF
Following these steps, I sometimes get the SeldonDeployment created:
seldondeployment.machinelearning.seldon.io/mlflow created
but it never becomes available.
Most of the time I get the following error:
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "mseldondeployment.kb.io": Post https://seldon-core.kf.svc:443/mutate-machinelearning-seldon-io-v1-seldondeployment?timeout=30s: no service port 'ƻ' found for service "seldon-core"
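The odd `'ƻ'` in that message does not look like encoding debris: the character is U+01BB, whose code point is 443, which matches the port in the webhook URL. It appears the numeric port was printed as a character, so the message most likely means that no port 443 was found on the `seldon-core` Service. A small sketch to verify the code point, using a hypothetical helper `utf8_two_byte_codepoint` and assuming the script is saved as UTF-8:

```shell
#!/bin/sh
# Hypothetical helper: decode a two-byte UTF-8 character into its code point
# from the raw bytes, so the result does not depend on the shell's locale.
utf8_two_byte_codepoint() {
  # od prints the bytes as unsigned decimals, e.g. "198 187" for U+01BB
  set -- $(printf '%s' "$1" | od -An -tu1)
  # 110xxxxx 10yyyyyy  ->  xxxxx * 64 + yyyyyy
  echo $(( ($1 - 192) * 64 + ($2 - 128) ))
}

utf8_two_byte_codepoint 'ƻ'   # prints 443
```

If that reading is right, the ports the Service actually exposes can be compared against 443 with `kubectl get svc seldon-core -n kf -o jsonpath='{.spec.ports}'`.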
Goal: be able to publish a model from MLflow into Seldon Core (running in Kubeflow) so that users can easily deploy the models they are tracking and managing in MLflow.