canonical / bundle-kubeflow

Charmed Kubeflow

tls: failed to verify certificate: x509: certificate signed by unknown authority #1095

Open ShrishtiKarkera opened 3 hours ago

ShrishtiKarkera commented 3 hours ago

Bug Description

I'm unable to create a KServe InferenceService from a JupyterLab notebook. When I create the inference service through the client, it fails with this error: "inferenceservice.kserve-webhook-server.defaulter\": failed to call webhook: Post \"https://kserve-webhook-server-service.kubeflow.svc:443/mutate-serving-kserve-io-v1beta1-inferenceservice?timeout=10s\": tls: failed to verify certificate: x509: certificate signed by unknown authority"

The inference service client code looks like this, and the model is stored in MinIO:

from datetime import datetime
from kserve import KServeClient, constants
from kserve.models import (
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec
)
from kubernetes import client
import utils

# Get the default target namespace
namespace = "admin"

now = datetime.now()
v = now.strftime("%Y-%m-%d--%H-%M-%S")

name = 'iris-classifier'
kserve_version = 'v1beta1'
api_version = constants.KSERVE_GROUP + '/' + kserve_version

# Create the InferenceService
isvc = V1beta1InferenceService(
    api_version=api_version,
    kind=constants.KSERVE_KIND,
    metadata=client.V1ObjectMeta(
        name=name, 
        namespace=namespace, 
        annotations={'sidecar.istio.io/inject': 'false'}
    ),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            service_account_name="sa-minio-kserve",
            sklearn=V1beta1SKLearnSpec(
                storage_uri="s3://mlpipeline/models/iris_model.pkl"
            )
        )
    )
)

# Create the InferenceService in KServe
KServe = KServeClient()
KServe.create(isvc)
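
For context, the predictor above references the sa-minio-kserve service account (step 3 of the repro steps below, which grants KServe access to MinIO). A minimal sketch of how such a service account and its credentials secret could be created with the Kubernetes Python client is shown here; the secret keys and serving.kserve.io annotations follow upstream KServe S3 conventions, and the endpoint and credentials are placeholders rather than values taken from this report.

from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Placeholder MinIO credentials secret, annotated for KServe's S3 storage initializer
secret = client.V1Secret(
    metadata=client.V1ObjectMeta(
        name="minio-kserve-secret",
        namespace="admin",
        annotations={
            "serving.kserve.io/s3-endpoint": "minio.kubeflow:9000",  # assumed in-cluster endpoint
            "serving.kserve.io/s3-usehttps": "0",
        },
    ),
    string_data={
        "AWS_ACCESS_KEY_ID": "<minio-access-key>",
        "AWS_SECRET_ACCESS_KEY": "<minio-secret-key>",
    },
)
core.create_namespaced_secret("admin", secret)

# Service account referenced by service_account_name="sa-minio-kserve" above
sa = client.V1ServiceAccount(
    metadata=client.V1ObjectMeta(name="sa-minio-kserve", namespace="admin"),
    secrets=[client.V1ObjectReference(name="minio-kserve-secret")],
)
core.create_namespaced_service_account("admin", sa)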

I checked the certificates and found everything to be in place. I also tried restarting the MutatingWebhookConfiguration, but that didn't help.
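
One way to double-check that from the notebook is to compare the caBundle registered on the mutating webhook with the CA certificate stored in the webhook server's secret; if they differ, the API server rejects the webhook's certificate with exactly the error above. This is only a sketch: the webhook configuration and secret names are the upstream KServe defaults and may differ in Charmed Kubeflow, and the notebook's service account may not have permission to read these resources.

from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() when running inside the cluster

# caBundle the API server uses to verify the webhook's TLS certificate
webhook_cfg = client.AdmissionregistrationV1Api().read_mutating_webhook_configuration(
    "inferenceservice.serving.kserve.io"  # assumed webhook configuration name
)
registered_ca = webhook_cfg.webhooks[0].client_config.ca_bundle  # base64-encoded PEM

# CA published alongside the certificate the webhook server actually serves
secret = client.CoreV1Api().read_namespaced_secret(
    "kserve-webhook-server-cert", "kubeflow"  # assumed secret name and namespace
)
served_ca = secret.data.get("ca.crt")  # also base64-encoded PEM

print("caBundle matches served CA:", registered_ca == served_ca)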

To Reproduce

  1. Deploy Charmed Kubeflow - https://charmed-kubeflow.io/docs/get-started-with-charmed-kubeflow
  2. Allow MinIO access - https://charmed-kubeflow.io/docs/allow-access-minio
  3. Allow KServe to access MinIO (see the service account sketch in the Bug Description above)
  4. Launch a new notebook (scipy image)
  5. Execute the following code. Note: the model is stored in the MinIO bucket mlpipeline (upload the model's .pkl file; a sketch of the upload follows the training code below).
pip install minio boto3 mlflow
# Consolidated imports (duplicates removed)
import os

import joblib
import numpy as np
import pandas as pd
from minio import Minio
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the iris dataset into a DataFrame
iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = iris.target
df = df.dropna()

# Split features and target, then create a stratified train/test split
target_column = 'species'
X = df.loc[:, df.columns != target_column]
y = df.loc[:, df.columns == target_column]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=47
)

# Train a simple logistic regression classifier and serialize it to disk
iris_model = LogisticRegression(max_iter=200)
iris_model.fit(X_train, y_train)
joblib.dump(iris_model, 'iris_model.pkl')
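
The note in step 5 mentions uploading the serialized model to the mlpipeline bucket, but the upload itself is not shown. A minimal sketch with the minio client might look like the following; the endpoint and credentials are placeholders, and the object key matches the storage_uri used in the InferenceService spec.

from minio import Minio

# Placeholder endpoint and credentials; substitute the values configured in your deployment
minio_client = Minio(
    "minio.kubeflow.svc.cluster.local:9000",  # assumed in-cluster MinIO endpoint
    access_key="<minio-access-key>",
    secret_key="<minio-secret-key>",
    secure=False,
)
if not minio_client.bucket_exists("mlpipeline"):
    minio_client.make_bucket("mlpipeline")
# Object key matches storage_uri="s3://mlpipeline/models/iris_model.pkl"
minio_client.fput_object("mlpipeline", "models/iris_model.pkl", "iris_model.pkl")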
Then create and deploy the InferenceService using exactly the client code already shown in the Bug Description above (same namespace, service account, and storage_uri), ending with KServe.create(isvc).

Environment

AWS t3.2xlarge instance with 10 GB of storage. Installed Charmed Kubeflow, MinIO, and MLflow; allowed MinIO and MLflow access.

Relevant Log Output

Post \"https://kserve-webhook-server-service.kubeflow.svc:443/mutate-serving-kserve-io-v1beta1-inferenceservice?timeout=10s\": tls: failed to verify certificate: x509: certificate signed by unknown authority","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"}
{"level":"error","ts":"2024-09-30T20:50:29Z","msg":"Reconciler error","controller":"inferenceservice","controllerGroup":"serving.kserve.io","controllerKind":"InferenceService","InferenceService":{"name":"iris-classifier","namespace":"admin"},"namespace":"admin","name":"iris-classifier","reconcileID":"57908d65-5f49-4305-ade5-3247160b89ec","error":"Internal error occurred: failed calling webhook \"inferenceservice.kserve-webhook-server.defaulter\": failed to call webhook: Post \"https://kserve-webhook-server-service.kubeflow.svc:443/mutate-serving-kserve-io-v1beta1-inferenceservice?timeout=10s\": tls: failed to verify certificate: x509: certificate signed by unknown authority","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"}

Additional Context

No response

syncronize-issues-to-jira[bot] commented 3 hours ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6340.

This message was autogenerated