sungsoo / sungsoo.github.io

Sung-Soo Kim's Blog
feat: KServe Inference #21

Open sungsoo opened 2 years ago

sungsoo commented 2 years ago

KServe Inference

First InferenceService

Run your first InferenceService

In this tutorial, you will deploy a ScikitLearn InferenceService.

The inference service loads a simple iris ML model. You then send it a list of attributes and print the predicted class of iris plant.

Since your model is being deployed as an InferenceService, not a raw Kubernetes Service, you just need to provide the trained model and it gets some super powers out of the box 🚀.

1. Create test InferenceService

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    sklearn:
      storageUri: "gs://kfserving-samples/models/sklearn/iris"

Once you've created your YAML file (named something like "sklearn.yaml"):

kubectl create namespace kserve-test
kubectl apply -f sklearn.yaml -n kserve-test
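
The same manifest can also be generated programmatically. The sketch below builds it as a plain Python dict and prints it as JSON, which kubectl accepts in place of YAML. The helper name build_isvc_manifest is an illustrative assumption and not part of KServe.

```python
import json


def build_isvc_manifest(name: str, storage_uri: str) -> dict:
    """Return an InferenceService manifest equivalent to the YAML above."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {"predictor": {"sklearn": {"storageUri": storage_uri}}},
    }


manifest = build_isvc_manifest(
    "sklearn-iris", "gs://kfserving-samples/models/sklearn/iris"
)

# Print JSON so the output can be piped into kubectl.
print(json.dumps(manifest, indent=2))
```

If this is saved as, say, make_isvc.py (a hypothetical filename), the output can be applied with: python make_isvc.py | kubectl apply -n kserve-test -f -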

You can verify the deployment of this inference service as follows.

(base) ╭─sungsoo@z840 ~
╰─$ k get pods -A -w
NAMESPACE                       NAME                                               READY   STATUS    RESTARTS   AGE
... (middle of output omitted)
kserve-test                     sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9   0/2     Pending   0          2s
kserve-test                     sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9   0/2     Pending   0          3s
kserve-test                     sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9   0/2     Init:0/1   0          3s
kserve-test                     sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9   0/2     Init:0/1   0          8s
kserve-test                     sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9   0/2     Init:0/1   0          41s
kserve-test                     sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9   0/2     PodInitializing   0          51s
kserve-test                     sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9   1/2     Running           0          97s
kserve-test                     sklearn-iris-predictor-default-00001-deployment-7958c8bfddv68k9   2/2     Running           0          98s

2. Check InferenceService status.

kubectl get inferenceservices sklearn-iris -n kserve-test
NAME           URL                                                 READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION                    AGE
sklearn-iris   http://sklearn-iris.kserve-test.example.com         True           100                              sklearn-iris-predictor-default-47q2g   7d23h

If your DNS contains example.com, ask your administrator to configure DNS or to set up a custom domain.

3. Determine the ingress IP and ports

Execute the following command to determine whether your Kubernetes cluster is running in an environment that supports external load balancers:

$ kubectl get svc istio-ingressgateway -n istio-system
NAME                   TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)   AGE
istio-ingressgateway   LoadBalancer   172.21.109.129   130.211.10.121   ...       17h

Or, on MicroK8s with Kubeflow:

(base) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ kubectl get svc istio-ingressgateway -n kubeflow
NAME                   TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)                                                                                                                                                                   AGE
istio-ingressgateway   LoadBalancer   10.152.183.116   10.64.140.43   15020:32267/TCP,80:32425/TCP,443:31890/TCP,15029:31587/TCP,15030:31591/TCP,15031:32223/TCP,15032:32596/TCP,15443:32307/TCP,15011:32504/TCP,8060:32176/TCP,853:30715/TCP   12h

Load Balancer

If the EXTERNAL-IP value is set, your environment has an external load balancer that you can use for the ingress gateway.

export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

Or, on MicroK8s with Kubeflow:

export INGRESS_HOST=$(kubectl -n kubeflow get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export INGRESS_PORT=$(kubectl -n kubeflow get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

Node Port

If the EXTERNAL-IP value is none (or perpetually pending), your environment does not provide an external load balancer for the ingress gateway. In this case, you can access the gateway using the service's node port.

# GKE
export INGRESS_HOST=worker-node-address
# Minikube
export INGRESS_HOST=$(minikube ip)
# Other environment(On Prem)
export INGRESS_HOST=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
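
The Load Balancer and Node Port cases differ only in which fields of the service are read. The sketch below applies the same selection logic to a parsed `kubectl get svc istio-ingressgateway -o json` document. The function name resolve_ingress and the sample data are assumptions made for illustration, including the assumption that port 80 carries the name http2, as the jsonpath expressions above imply.

```python
from typing import Optional, Tuple


def resolve_ingress(service: dict, node_ip: Optional[str] = None) -> Tuple[str, int]:
    """Pick (host, port) from a parsed `kubectl get svc ... -o json` document."""
    http2 = next(p for p in service["spec"]["ports"] if p.get("name") == "http2")
    lb_ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    if lb_ingress and lb_ingress[0].get("ip"):
        # Load Balancer case: external IP plus the service port.
        return lb_ingress[0]["ip"], http2["port"]
    if node_ip is None:
        raise ValueError("no external IP; pass the address of a worker node")
    # Node Port case: a node address plus the nodePort.
    return node_ip, http2["nodePort"]


# Made-up sample, trimmed to the fields the function reads.
sample = {
    "spec": {"ports": [{"name": "http2", "port": 80, "nodePort": 32425}]},
    "status": {"loadBalancer": {"ingress": [{"ip": "10.64.140.43"}]}},
}
print(resolve_ingress(sample))
```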

Port Forward

Alternatively, you can use port forwarding for testing purposes:

INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8080:80
# start another terminal
export INGRESS_HOST=localhost
export INGRESS_PORT=8080

4. Curl the InferenceService

First, prepare your inference input request:

{
  "instances": [
    [6.8,  2.8,  4.8,  1.4],
    [6.0,  3.4,  4.5,  1.6]
  ]
}

Save this as a JSON test input file (named something like "iris-input.json"), then send it using one of the methods below.

Real DNS

If you have configured DNS, you can curl the InferenceService directly with the URL obtained from the status output, as in the commands further down.

This is where I get an error. It looks like a DNS problem...

Let's look into it!

curl -v http://sklearn-iris.kserve-test.${CUSTOM_DOMAIN}/v1/models/sklearn-iris:predict -d @./iris-input.json

curl -v http://sklearn-iris.kserve-test.example.com/v1/models/sklearn-iris:predict -d @./iris-input.json

Magic DNS

If you don't want to go through the trouble of getting a real domain, you can instead use the "magic" DNS service xip.io. The key is to get the external IP of your cluster.

kubectl get svc istio-ingressgateway --namespace istio-system

Look for the value in the EXTERNAL-IP column (in this case 35.237.217.209):

NAME                   TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)                                                                                                                                      AGE
istio-ingressgateway   LoadBalancer   10.51.253.94   35.237.217.209

The next step is to set up the custom domain:

kubectl edit cm config-domain --namespace knative-serving

Now in your editor, change example.com to {{external-ip}}.xip.io (make sure to replace {{external-ip}} with the IP you found earlier).

With the change applied you can now directly curl the URL

curl -v http://sklearn-iris.kserve-test.35.237.217.209.xip.io/v1/models/sklearn-iris:predict -d @./iris-input.json
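
To make the hostname composition explicit, here is a small sketch that assembles the predict URL from its parts once the domain has been changed to <external-ip>.xip.io. The function name magic_dns_url is a hypothetical helper, not a KServe API.

```python
def magic_dns_url(service: str, namespace: str, external_ip: str, model: str) -> str:
    """Build the V1 predict URL for a cluster whose domain is <external-ip>.xip.io."""
    host = f"{service}.{namespace}.{external_ip}.xip.io"
    return f"http://{host}/v1/models/{model}:predict"


print(magic_dns_url("sklearn-iris", "kserve-test", "35.237.217.209", "sklearn-iris"))
```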

From Ingress gateway with HOST Header

If you do not have DNS, you can still curl the ingress gateway's external IP and select the service with the Host header.

SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -n kserve-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/sklearn-iris:predict -d @./iris-input.json
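
The same request can be sketched in Python with only the standard library. The block builds the request without sending it. The function name build_predict_request is an assumption for illustration, and the host and port values stand in for INGRESS_HOST and INGRESS_PORT from the port-forward setup. urlsplit(...).netloc plays the role of the cut -d "/" -f 3 step.

```python
import json
import urllib.request
from urllib.parse import urlsplit


def build_predict_request(ingress_host, ingress_port, service_hostname, model, instances):
    """Build (but do not send) a V1 predict request routed by the Host header."""
    url = f"http://{ingress_host}:{ingress_port}/v1/models/{model}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Host": service_hostname, "Content-Type": "application/json"},
        method="POST",
    )


# Equivalent of: ... -o jsonpath='{.status.url}' | cut -d "/" -f 3
service_hostname = urlsplit("http://sklearn-iris.kserve-test.example.com").netloc

req = build_predict_request(
    "localhost", 8080, service_hostname, "sklearn-iris",
    [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]],
)
print(req.get_method(), req.full_url, req.get_header("Host"))
# To actually send it: urllib.request.urlopen(req).read()
```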

From local cluster gateway

If you are calling from inside the cluster, you can curl the internal URL with the host {{InferenceServiceName}}.{{namespace}}:

curl -v http://sklearn-iris.kserve-test/v1/models/sklearn-iris:predict -d @./iris-input.json

5. Run Performance Test

# use kubectl create instead of apply because the job template is using generateName which doesn't work with kubectl apply
kubectl create -f https://raw.githubusercontent.com/kserve/kserve/release-0.7/docs/samples/v1beta1/sklearn/v1/perf.yaml -n kserve-test

Expected Output

kubectl logs load-test8b58n-rgfxr -n kserve-test
Requests      [total, rate, throughput]         30000, 500.02, 499.99
Duration      [total, attack, wait]             1m0s, 59.998s, 3.336ms
Latencies     [min, mean, 50, 90, 95, 99, max]  1.743ms, 2.748ms, 2.494ms, 3.363ms, 4.091ms, 7.749ms, 46.354ms
Bytes In      [total, mean]                     690000, 23.00
Bytes Out     [total, mean]                     2460000, 82.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:30000
Error Set:
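
As a sanity check, the headline figures in this report can be re-derived from its raw totals. The report layout resembles that of the vegeta load tester. Assuming its definitions (rate = requests / attack duration, throughput = successful requests / total duration including wait), the arithmetic works out as follows.

```python
# Figures copied from the report above.
total_requests = 30000
attack_seconds = 59.998
wait_seconds = 0.003336      # 3.336ms
success_ratio = 1.0          # 100.00%
bytes_in_total = 690000
bytes_out_total = 2460000

rate = total_requests / attack_seconds
throughput = total_requests * success_ratio / (attack_seconds + wait_seconds)
mean_bytes_in = bytes_in_total / total_requests
mean_bytes_out = bytes_out_total / total_requests

print(f"rate:           {rate:.2f} req/s")
print(f"throughput:     {throughput:.2f} req/s")
print(f"mean bytes in:  {mean_bytes_in:.2f}")
print(f"mean bytes out: {mean_bytes_out:.2f}")
```

The recomputed values agree with the report (500.02, 499.99, 23.00, 82.00), so each response body averaged 23 bytes and each request body 82 bytes.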
sungsoo commented 2 years ago

Run your first InferenceService

KFServing InferenceService deployment and prediction

KFServing - Deep dive

What is serverless?

Serverless is a cloud-native development model that lets developers build and run applications without having to manage servers.

Python SDK for building, training, and deploying ML models

Overview of Kubeflow Fairing

Kubeflow Fairing is a Python package that streamlines the process of building, training, and deploying machine learning (ML) models in a hybrid cloud environment. By using Kubeflow Fairing and adding a few lines of code, you can run your ML training job locally or in the cloud, directly from Python code or a Jupyter notebook. After your training job is complete, you can use Kubeflow Fairing to deploy your trained model as a prediction endpoint.

Use Kubeflow Fairing SDK

To install the SDK:

pip install kubeflow-fairing

To get started quickly, you can run the E2E MNIST sample.

Documentation

To learn how Kubeflow Fairing streamlines the process of training and deploying ML models in the cloud, read the Kubeflow Fairing documentation.

To learn the Kubeflow Fairing SDK API, read the HTML documentation.

sungsoo commented 2 years ago

Getting Started with KServe

Install the KServe "Quickstart" environment

You can get started with a local deployment of KServe by using the KServe quick installation script on Kind:

First, download the quick_install.sh file.

wget https://raw.githubusercontent.com/kserve/kserve/release-0.8/hack/quick_install.sh

Insert a shebang line for your shell as the first line of quick_install.sh. I use zsh, so I added the following.

#!/usr/bin/zsh
...
set -e
############################################################
# Help                                                     #
############################################################
Help()
...

Then run the script:

(base) ╭─sungsoo@sungsoo-HP-Z840 ~/kubeflow
╰─$ quick_install.sh

You should see console output like the following.

Downloading istio-1.9.0 from https://github.com/istio/istio/releases/download/1.9.0/istio-1.9.0-linux-amd64.tar.gz ...

Istio 1.9.0 Download Complete!

Istio has been successfully downloaded into the istio-1.9.0 folder on your system.

Next Steps:
See https://istio.io/latest/docs/setup/install/ to add Istio to your Kubernetes cluster.

To configure the istioctl client tool for your workstation,
add the /home/sungsoo/kubeflow/istio-1.9.0/bin directory to your environment path variable with:
     export PATH="$PATH:/home/sungsoo/kubeflow/istio-1.9.0/bin"

Begin the Istio pre-installation check by running:
     istioctl x precheck

Need more information? Visit https://istio.io/latest/docs/setup/install/
namespace/istio-system unchanged
✔ Istio core installed
✔ Istiod installed
- Processing resources for Ingress gateways. Waiting for Deployment/istio-system/istio-ingressgateway
sungsoo commented 2 years ago

Main errors

Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/ambassadorinstallations.getambassador.io created
error: .status.conditions accessor error: <nil> is of the type <nil>, expected []interface{}

microk8s reinstallation warning message

Pod Security Policy

PodSecurityPolicy has been deprecated since Kubernetes v1.21 and is scheduled for removal in v1.25.

Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+

About removing Juju

[Removal terms](https://juju.is/docs/olm/removing-things#heading--removal-terms)

There is a distinction between the similar sounding commands unregister, detach, remove, destroy, and kill. These commands are ordered such that their effect increases in severity:

- Unregister means to decouple a resource from a logical entity for the client. The effect is local to the client only and does not affect the logical entity in any way.
- Detach means to decouple a resource from a logical entity (such as an application). The resource will remain available and the underlying cloud resources used by it also remain in place.
- Remove means to cleanly remove a single logical entity. This is a destructive process, meaning the entity will no longer be available via Juju, and any underlying cloud resources used by it will be freed (however, this can often be overridden on a case-by-case basis to leave the underlying cloud resources in place).
- Destroy means to cleanly tear down a logical entity, along with everything within these entities. This is a very destructive process.
- Kill means to forcibly tear down an unresponsive logical entity, along with everything within it. This is a very destructive process that does not guarantee associated resources are cleaned up.

These command terms/prefixes do not apply to all commands in a generic way. The explanations above are merely intended to convey how a command generally operates and what its severity level is.

[Forcing removals](https://juju.is/docs/olm/removing-things#heading--forcing-removals)

Juju object removal commands do not succeed when there are errors in the multiple steps that are required to remove the underlying object. For instance, a unit will not remove properly if it has a hook error, or a model cannot be removed if application units are in an error state. This is an intentionally conservative approach to the deletion of things.

However, this policy can also be a source of frustration for users in certain situations (i.e. "I don't care, I just want my model gone!"). Because of this, several commands have a --force option.

Furthermore, even when utilising the --force option, the process may take more time than an administrator is willing to accept (i.e. "Just go away as quickly as possible!"). Because of this, several commands that support the --force option have, in addition, support for a --no-wait option.

Caution: The --force and --no-wait options should be regarded as tools to wield as a last resort. Using them introduces a chance of associated parts (e.g., relations) not being cleaned up, which can lead to future problems.

As of v.2.6.1, this is the state of affairs for those commands that support at least the --force option:

command | --force | --no-wait
-- | -- | --
destroy-model | yes | yes
detach-storage | yes | no
remove-application | yes | yes
remove-machine | yes | yes
remove-offer | yes | no
remove-relation | yes | no
remove-storage | yes | no
remove-unit | yes | yes

When a command has --force but not --no-wait, this means that the combination of those options simply does not apply.
sungsoo commented 2 years ago

Error on juju deploy

If you get the following error after running the juju deploy command,

(base) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ juju deploy kubeflow --trust
ERROR The charm or bundle "kubeflow" is ambiguous.

run it again with the namespace of the source added, as follows.

(base) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ juju deploy cs:kubeflow --trust

Juju uninstallation

# Hard reinstall of clients
snap remove --purge  juju
rm -rf ~/.local/share/juju
snap install juju --classic

# A hard reinstall of controllers or machines needs a bit more.
# Fortunately, Juju leaves a helper for this.
sudo /usr/sbin/remove-juju-services
sungsoo commented 2 years ago

KServe: A Robust and Extensible Cloud Native Model Server

If you are familiar with Kubeflow, you know KFServing as the platform's model server and inference engine. In September 2021, the KFServing project was transformed into KServe.

Apart from the name change, KServe is now an independent component that has graduated from the Kubeflow project. This separation allows KServe to evolve as a separate cloud native inference engine deployed as a standalone model server. Tight integration with Kubeflow will of course continue, but the two are treated and maintained as independent open source projects.

KServe was developed collaboratively by Google, IBM, Bloomberg, Nvidia, and Seldon as an open source, cloud native model server for Kubernetes. The latest version, 0.8, changed the taxonomy and nomenclature and focused on turning the model server into a standalone component.

Let's look at the core capabilities of KServe.

A model server is to machine learning models what an application is to code binaries. Both provide the runtime and execution context for deployments. As a model server, KServe provides the foundation for serving machine learning and deep learning models at scale.

KServe can be deployed as a traditional Kubernetes deployment or as serverless with support for scale-to-zero. In serverless mode it relies on Knative Serving, which provides automatic scale-up and scale-down. Istio is used as the ingress that exposes service endpoints to API consumers. The combination of Istio and Knative Serving enables interesting scenarios such as blue/green and canary deployments of models.

RawDeployment Mode, which lets you use KServe without Knative Serving, supports traditional scaling techniques such as the Horizontal Pod Autoscaler (HPA) but does not support scale-to-zero.

KServe Architecture

The KServe model server has a control plane and a data plane. The control plane manages and reconciles the custom resources responsible for inference. In serverless mode, it coordinates with Knative resources to manage autoscaling.

At the heart of the KServe control plane is the KServe controller, which manages the lifecycle of an inference service. It is responsible for creating the service, the ingress resources, the model server container, and the model agent container used for request/response logging, batching, and pulling models from the model store. The model store is the repository of models registered with the model server. It is typically an object storage service such as Amazon S3, Google Cloud Storage, Azure Storage, or MinIO.

The data plane manages the request/response cycle that targets a specific model. It consists of a predictor, a transformer, and an explainer.

An AI application sends a REST or gRPC request to the predictor endpoint. The predictor acts as an inference pipeline that calls the transformer component. The transformer component can pre-process inbound data (the request) and post-process outbound data (the response). Optionally, there can be an explainer component that provides AI explainability for the hosted model. KServe recommends the V2 protocol, which is interoperable and extensible.
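
The request/response cycle described above can be sketched as three plain functions. This is a toy illustration of the ordering only (pre-process, predict, post-process), not KServe's implementation. Every name in it is hypothetical, the threshold "model" is a made-up stand-in for the real iris classifier, and the {"instances": ...} / {"predictions": ...} shapes follow the V1 protocol used in the curl examples earlier in the thread.

```python
from typing import Callable, List

Instances = List[List[float]]


def preprocess(request: dict) -> Instances:
    """Transformer step on the inbound request: extract and validate instances."""
    instances = request["instances"]
    if any(len(row) != 4 for row in instances):
        raise ValueError("each iris instance needs 4 features")
    return instances


def postprocess(predictions: List[int]) -> dict:
    """Transformer step on the outbound response: wrap the predictions."""
    return {"predictions": predictions}


def handle(request: dict, predict: Callable[[Instances], List[int]]) -> dict:
    """Request/response cycle: preprocess -> predict -> postprocess."""
    return postprocess(predict(preprocess(request)))


def toy_predict(instances: Instances) -> List[int]:
    """Stand-in model: a made-up threshold on the third feature."""
    return [1 if row[2] > 2.5 else 0 for row in instances]


response = handle({"instances": [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]]}, toy_predict)
print(response)
```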

The data plane also has endpoints for checking the readiness and liveness of a model. It also provides APIs for retrieving model metadata.

Supported Frameworks and Runtimes

KServe supports a wide range of machine learning and deep learning frameworks. For deep learning frameworks and runtimes, it works with existing serving infrastructure such as TensorFlow Serving, TorchServe, and Triton Inference Server. Through Triton, KServe can host TensorFlow, ONNX, PyTorch, and TensorRT.

For classical machine learning models based on SKLearn, XGBoost, Spark MLLib, and LightGBM, KServe uses Seldon's MLServer.

KServe's extensible framework makes it possible to plug in any runtime that complies with the V2 inference protocol.

Multi-Model Serving with ModelMesh

KServe deploys one model per inference service, which limits the scalability of the platform to the available CPUs and GPUs. This limitation becomes obvious when running inference on GPUs, which are expensive and scarce compute resources.

Multi-model serving makes it possible to overcome infrastructure constraints such as compute resources, maximum pods, and maximum IP addresses.

ModelMesh Serving, developed by IBM, is a Kubernetes-based platform for serving ML/DL models in real time, optimized for high volume/density use cases. Much like an operating system that manages processes to make optimal use of the available resources, ModelMesh optimizes deployed models so that they run efficiently within the cluster.

By intelligently managing in-memory model data across the cluster of deployed pods, based on how those models are used over time, the system makes the most of the available cluster resources.

ModelMesh Serving is based on the KServe v2 data plane API, so it can be deployed as a runtime similar to NVIDIA Triton Inference Server. When a request reaches the KServe data plane, it is delegated to ModelMesh Serving.

The integration of ModelMesh Serving with KServe is currently in the Alpha stage. As both projects mature, the integration will become tighter, making it possible to mix and match the features and capabilities of the two platforms.

As model serving becomes a core building block of MLOps, open source projects such as KServe have become important. Its extensibility to existing and future runtimes makes KServe a unique model serving platform.

https://github.com/kserve/kserve
https://www.kubeflow.org/docs/external-add-ons/kserve/kserve/
https://kserve.github.io/website/0.8/

sungsoo commented 2 years ago

Bypassing Istio dex for KServe

Article Source


These days I am working on MLOps pieces such as kubeflow at my company. Originally we planned to handle model deployment the way we always had, but the data analysis team wanted to speed up the deployment process, so we decided to use kserve as well. While testing this in an on-premises environment I ran into a dex authentication problem, and this post briefly summarizes how I resolved it.

kubeflow and istio architecture (official documentation)

When deploying Kubeflow, I deployed istio and dex along with it. istio is used to connect services, and dex is used for authentication. If you port-forward istio and open the kubeflow dashboard, the first thing you reach is the dex login page. In other words, connecting through the istio gateway requires these credentials.

To deploy kserve in a serverless configuration, knative has to be deployed with it. knative in turn uses istio for connectivity. This is where the problem arises: API requests pass through the istio gateway and therefore need credentials. It would be fine if authentication were required only for connections from outside the cluster, but it is also required when connecting through a service from inside the cluster.

Installation

For the kubeflow deployment I followed 모두의 MLOps ("MLOps for Everyone").

For the kserve installation I followed the official documentation. Since istio had already been deployed as part of the kubeflow deployment, I skipped the istio step.

Problem

First, create a simple pod in the cluster that does nothing. We will attach to this pod and send curl requests to the internal service, so we deploy an image that has curl installed.

apiVersion: v1
kind: Pod
metadata:
    name: myapp-pod
    labels:
        app: myapp
spec:
    containers:
    - name: myapp-container
      image: curlimages/curl:7.82.0
      command: ['sh', '-c', 'echo Hello k8s! && sleep 3600']

For kserve, deploy a simple iris prediction model as in the example on the official site.

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    sklearn:
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"

Listing the services shows that a service exists for this model.

kubectl get svc -n kserve-test

NAME                                           TYPE           CLUSTER-IP      EXTERNAL-IP                                            PORT(S)                                      AGE
sklearn-iris                                   ExternalName   <none>          knative-local-gateway.istio-system.svc.cluster.local   <none>                                       133m

Now let's send a request using the name of this service. First, attach to the pod created above.

kubectl exec --stdin --tty myapp-pod -- /bin/sh

Next, create the JSON file from the example and send a request to the service.

cat <<EOF > "./iris-input.json"
{
  "instances": [
    [6.8,  2.8,  4.8,  1.4],
    [6.0,  3.4,  4.5,  1.6]
  ]
}
EOF

curl -v http://sklearn-iris.kserve-test.svc.cluster.local/v1/models/sklearn-iris:predict -d @./iris-input.json

You will probably get a 302 response code along with information about dex authentication.

Strictly speaking, you can solve this by including the dex credentials in the request, as demanded. The official repo even has a helpful example. You can wrangle it from the CLI as shown there, or even log in to the kubeflow dashboard, take the session information it uses, and send it in a request header.

But is that enough? The problem is that every application using istio is then required to supply this dex information. If the backend team uses istio, should they have to generate a key each time just because of the dex instance that the machine learning team uses? There seem to be a fair number of issues from people running into similar problems (#1, #2; the first was filed in 2019, but the second only a few days ago).

Cause

Why does this happen? First, check the istio virtual services.

kubectl get virtualservices.networking.istio.io --all-namespaces

This shows the virtual service for dex and the gateway it uses. Since kubeflow uses dex for authentication, it is attached to kubeflow-gateway.

Next, check the details of this gateway.

kubectl get gateways.networking.istio.io -n kubeflow kubeflow-gateway -o yaml
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP

You can see that its selector points at the default ingress gateway controller. Every gateway that uses this default controller is affected by dex as well. Let's also check the knative gateway.

kubectl get gateways.networking.istio.io -n knative-serving knative-local-gateway -o yaml
spec:
  selector:
    istio: ingressgateway

It likewise uses the default controller.

Solution

We need a way to bypass this authentication. I did find an approach that uses an Envoy filter, but it did not work well, perhaps because of a version difference. If you want to try it, you may need to modify the patch as shown below.

patch:
  operation: MERGE
  value:
    name: envoy.ext_authz_disabled
    typed_per_filter_config:
      envoy.ext_authz:
        "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthzPerRoute
        disabled: true

What I found in a GitHub issue did solve it.

The istio documentation has a section on External Authorization. Since dex is already deployed, there is no need to deploy an additional authorizer. First, declare the auth provider in the configmap. Open the configmap:

kubectl edit configmap istio -n istio-system

Then add the dex-related information to it.

extensionProviders:
- name: dex-auth-provider
  envoyExtAuthzHttp:
    service: "authservice.istio-system.svc.cluster.local"
    port: "8080"
    includeHeadersInCheck: ["authorization", "cookie", "x-auth-token"]
    headersToUpstreamOnAllow: ["kubeflow-userid"]

The GitHub issue specifies only the hosts that kf uses, but that does not work as-is here, perhaps because the current setup does not use a separate host. So instead the approach is to exclude the paths that kserve uses. Create the policy below.

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: dex-auth
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  action: CUSTOM
  provider:
    # The provider name must match the extension provider defined in the mesh config.
    name: dex-auth-provider
  rules:
  # The rules specify when to trigger the external authorizer.
  - to:
    - operation:
        notPaths: ["/v1*"]
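
To see which requests this rule hands to the external authorizer, the following Python sketch mimics the path-matching forms that the Istio AuthorizationPolicy reference describes for paths and notPaths (exact, prefix such as /v1*, suffix such as */info, and presence *). The function names are illustrative only and are not part of any Istio API; the real matching is performed by Envoy, and the request paths are example values.

```python
from typing import List


def path_matches(pattern: str, path: str) -> bool:
    """Match one AuthorizationPolicy path pattern against a request path."""
    if pattern == "*":                      # presence match: any non-empty path
        return path != ""
    if pattern.endswith("*"):               # prefix match, e.g. "/v1*"
        return path.startswith(pattern[:-1])
    if pattern.startswith("*"):             # suffix match, e.g. "*/info"
        return path.endswith(pattern[1:])
    return path == pattern                  # exact match


def triggers_ext_authz(path: str, not_paths: List[str]) -> bool:
    """A rule that only lists notPaths applies when the path matches none of them."""
    return not any(path_matches(p, path) for p in not_paths)


# Example request paths (illustrative values).
for request_path in ["/v1/models/sklearn-iris:predict", "/", "/v2/models/sklearn-iris/infer"]:
    decision = "dex" if triggers_ext_authz(request_path, ["/v1*"]) else "bypass"
    print(request_path, "->", decision)
```

The last case illustrates a limitation of the exclusion approach: any path outside /v1*, such as a /v2/... path, would still be sent to dex until it is added to notPaths.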

After that, delete the existing authn-filter and restart istiod.

kubectl delete -n istio-system envoyfilters.networking.istio.io authn-filter

kubectl rollout restart deployment/istiod -n istio-system

Now, sending the request again from the pod we connected to earlier returns a normal response with status code 200.

Admittedly, this approach has a drawback: every path you want to use must be added each time. I have not yet found a fancy way to require authorization only for Kubeflow. I will update this if I learn of a better method.

sungsoo commented 2 years ago

KServe Python Server

KServe's python server libraries implement a standardized library that is extended by model serving frameworks such as Scikit Learn, XGBoost and PyTorch. It encapsulates data plane API definitions and storage retrieval for models.

It provides many functionalities, including among others:

It supports the following storage providers:

sungsoo commented 2 years ago

KServe Client

Getting Started

KServe's Python client interacts with the KServe control plane APIs to execute operations on a remote KServe cluster, such as creating, patching, and deleting an InferenceService instance. See the Sample for Python SDK Client to get started.

Documentation for Client API

Class | Method | Description
-- | -- | --
KServeClient | set_credentials | Set Credentials
KServeClient | create | Create InferenceService
KServeClient | get | Get or watch the specified InferenceService or all InferenceServices in the namespace
KServeClient | patch | Patch the specified InferenceService
KServeClient | replace | Replace the specified InferenceService
KServeClient | delete | Delete the specified InferenceService
KServeClient | wait_isvc_ready | Wait for the InferenceService to be ready
KServeClient | is_isvc_ready | Check if the InferenceService is ready
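
As a rough sketch of how the create and wait_isvc_ready methods fit together, the snippet below builds the sklearn-iris InferenceService used in this thread and submits it with the Python SDK. It follows the general pattern of the v1beta1 SDK sample; the namespace kserve-test is an assumption, and the SDK imports are deferred into the function so the manifest helper can be used without a cluster or the SDK installed.

```python
def sklearn_iris_manifest(namespace: str = "kserve-test") -> dict:
    """Plain-dict form of the sklearn-iris InferenceService used in this thread."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": "sklearn-iris", "namespace": namespace},
        "spec": {
            "predictor": {
                "sklearn": {"storageUri": "gs://kfserving-samples/models/sklearn/iris"}
            }
        },
    }


def deploy_with_sdk(manifest: dict) -> None:
    """Create the InferenceService and block until it is ready (needs a cluster)."""
    # Imported here so this module loads even where the SDK is not installed.
    from kubernetes import client
    from kserve import (
        KServeClient,
        V1beta1InferenceService,
        V1beta1InferenceServiceSpec,
        V1beta1PredictorSpec,
        V1beta1SKLearnSpec,
    )

    meta = manifest["metadata"]
    storage_uri = manifest["spec"]["predictor"]["sklearn"]["storageUri"]
    isvc = V1beta1InferenceService(
        api_version=manifest["apiVersion"],
        kind=manifest["kind"],
        metadata=client.V1ObjectMeta(name=meta["name"], namespace=meta["namespace"]),
        spec=V1beta1InferenceServiceSpec(
            predictor=V1beta1PredictorSpec(
                sklearn=V1beta1SKLearnSpec(storage_uri=storage_uri)
            )
        ),
    )
    kserve_client = KServeClient()  # picks up the local kubeconfig by default
    kserve_client.create(isvc)
    kserve_client.wait_isvc_ready(meta["name"], namespace=meta["namespace"])
```

Calling deploy_with_sdk(sklearn_iris_manifest()) would be the SDK counterpart of kubectl apply followed by waiting for the service to become ready.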

sungsoo commented 2 years ago

KServe Installation and Example

kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.8.0/kserve.yaml

Related installation failure cases

sungsoo commented 2 years ago

KServe Installation Log

This document describes the log for KServe installation and testing.

Installation

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.yaml

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.8.0/kserve.yaml

Check pod status of KServe controller

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ kubectl get pods -A
NAMESPACE                       NAME                                               READY   STATUS    RESTARTS   AGE
kube-system                     coredns-7f9c69c78c-tgwrz                           1/1     Running   0          25h

(middle omitted)

cert-manager                    cert-manager-b4d6fd99b-m6l64                       1/1     Running   0          22m
cert-manager                    cert-manager-cainjector-74bfccdfdf-wp5t4           1/1     Running   0          22m
cert-manager                    cert-manager-webhook-65b766b5f8-s7lpj              1/1     Running   0          22m
kserve                          kserve-controller-manager-0                        2/2     Running   4          11m
(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ kubectl get pods -n kserve
NAME                          READY   STATUS    RESTARTS   AGE
kserve-controller-manager-0   2/2     Running   1          3m46s

KServe Inference Service Example

1. Create test InferenceService

The following YAML file (iris-sklearn.yaml) describes the inference service for the sklearn-based iris model.

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    sklearn:
      storageUri: "gs://kfserving-samples/models/sklearn/iris"

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ kubectl apply -f iris-sklearn.yaml -n traindb-ml
inferenceservice.serving.kserve.io/sklearn-iris created

2. Check InferenceService status.

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ k get inferenceservices -A
NAMESPACE    NAME           URL   READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION   AGE
traindb-ml   sklearn-iris                                                                               108s
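
The empty URL and READY columns above suggest that the controller had not yet reported a status for the service. As far as I understand, READY in this listing reflects the Ready condition under status.conditions, so a small helper for checking it on the JSON form of the object (kubectl get inferenceservice ... -o json) could look like the sketch below. The two sample documents are hand-written for illustration, not captured output.

```python
import json


def isvc_ready(isvc: dict) -> bool:
    """True when status.conditions contains type=Ready with status=True."""
    conditions = isvc.get("status", {}).get("conditions", [])
    return any(c.get("type") == "Ready" and c.get("status") == "True" for c in conditions)


# Hand-written examples: a freshly created object with no status yet,
# and one whose Ready condition has turned True.
pending = json.loads('{"metadata": {"name": "sklearn-iris"}}')
ready = json.loads(
    '{"metadata": {"name": "sklearn-iris"},'
    ' "status": {"conditions": [{"type": "Ready", "status": "True"}]}}'
)
print(isvc_ready(pending), isvc_ready(ready))
```
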
sungsoo commented 2 years ago

Knative and microk8s

Article Source


Install multipass

brew install multipass

Install hyperkit or qemu. Do not use VirtualBox; it doesn't allow access from the host network bridge by default.

For qemu, install libvirt and set qemu as the default driver

brew install libvirt
sudo multipass set local.driver=qemu

For hyperkit, install hyperkit and set it as the default driver

brew install hyperkit
sudo multipass set local.driver=hyperkit

Using multipass, create a new Ubuntu VM

Create a multipass VM with 3 CPUs, 2 GB of memory, and 8 GB of disk

multipass launch -n knative -c 3 -m 2G -d 8G

Set the primary name to knative to avoid always typing the name of the vm

multipass set client.primary-name=knative

Log in to the VM

multipass shell

Install [microk8s](https://microk8s.io/docs/getting-started) or from github/ubuntu/microk8s

sudo snap install microk8s --classic

Join the group microk8s

sudo usermod -a -G microk8s $USER
sudo chown -f -R $USER ~/.kube

Logout to refresh groups

exit

Log in to the VM again

multipass shell

Check status

microk8s status --wait-ready

Check access

microk8s kubectl get nodes

Set alias

alias kubectl='microk8s kubectl'
alias k='kubectl'

Enable dns

microk8s enable dns

Install Knative Serving (the TL;DR version of the instructions on knative.dev)

kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.2.0/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.2.0/serving-core.yaml

kubectl apply -f https://github.com/knative/net-kourier/releases/download/knative-v1.2.0/kourier.yaml
kubectl patch configmap/config-network \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"ingress.class":"kourier.ingress.networking.knative.dev"}}'

Check the status of the knative network layer load balancer

kubectl --namespace kourier-system get service kourier

If the EXTERNAL-IP is pending, you need a load balancer in your Kubernetes cluster.

You can use the metallb addon with a small range of IP addresses. Use ip a to inspect the IP address currently assigned, and assign IPs on the same subnet.

microk8s enable metallb:192.168.205.250-192.168.205.254
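
Before enabling the addon, it can help to confirm that the chosen range really sits on the VM's subnet. The sketch below does that check with Python's standard ipaddress module; the interface address 192.168.205.2/24 is an assumed example of what ip a might report inside the VM, not a value taken from this setup.

```python
import ipaddress


def range_in_subnet(ip_range: str, interface_cidr: str) -> bool:
    """Check that both ends of 'start-end' fall inside the interface's network."""
    start, end = (ipaddress.ip_address(part.strip()) for part in ip_range.split("-"))
    network = ipaddress.ip_interface(interface_cidr).network
    return start <= end and start in network and end in network


# 192.168.205.2/24 is an assumed interface address, as 'ip a' might report it.
print(range_in_subnet("192.168.205.250-192.168.205.254", "192.168.205.2/24"))  # True
print(range_in_subnet("192.168.1.141-192.168.1.142", "192.168.205.2/24"))      # False
```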

Yes, I know this is a hack, but it allows me to access the cluster from the host macOS 😅

Check again

kubectl --namespace kourier-system get service kourier

Output should look like this

NAME      TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                      AGE
kourier   LoadBalancer   10.152.183.31   192.168.205.16   80:32564/TCP,443:32075/TCP   7m17s

Check knative is up

kubectl get pods -n knative-serving

Configure Knative DNS

kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.2.0/serving-default-domain.yaml

Install the kn CLI

sudo curl -o /usr/local/bin/kn -sL https://github.com/knative/client/releases/download/knative-v1.2.0/kn-linux-amd64
sudo chmod +x /usr/local/bin/kn

Copy the kubeconfig to $HOME/.kube/config

microk8s config > $HOME/.kube/config

Create your first knative service

kn service create nginx --image nginx --port 80

Get the url of your new service

kn service describe nginx -o url

Curl the url

curl $(kn service describe nginx -o url)

You should see the nginx output

Thank you for using nginx.

List the pods for your service

kubectl get pods

After a minute, your pod should be deleted automatically (i.e., scaled to zero)

NAME                                      READY   STATUS        RESTARTS   AGE
nginx-00001-deployment-5c94d6d769-ssnc7   2/2     Terminating   0          83s

Access the url again

curl $(kn service describe nginx -o url)
sungsoo commented 2 years ago

Istio Installation

Attempt 1

Tried a simple installation using istioctl.

An error occurs while installing Istio.

(base) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ istioctl install
This will install the Istio 1.14.1 default profile with ["Istio core" "Istiod" "Ingress gateways"] components into the cluster. Proceed? (y/N) y
✔ Istio core installed
✔ Istiod installed
✘ Ingress gateways encountered an error: failed to wait for resource: resources not ready after 5m0s: timed out waiting for the condition
  Deployment/istio-system/istio-ingressgateway (containers with unready status: [istio-proxy])
- Pruning removed resources
Error: failed to install manifests: errors occurred during operation

Attempt 2

Removed Istio via microk8s.disable, then tried reinstalling with istioctl.

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ microk8s.disable istio
Disabling Istio
Error from server (NotFound): namespaces "istio-system" not found

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ istioctl install
This will install the Istio 1.14.1 default profile with ["Istio core" "Istiod" "Ingress gateways"] components into the cluster. Proceed? (y/N) y
✔ Istio core installed
✔ Istiod installed
✔ Ingress gateways installed
✔ Installation complete
Making this installation the default for injection and validation.

Thank you for installing Istio 1.14.  Please take a few minutes to tell us about your install/upgrade experience!  https://forms.gle/yEtCbt45FZ3VoDT5A

Attempt 3

Istio Ingress gateway validation

Let's check that the installation worked.

Check that the Istio objects were loaded properly in the 'istio-system' namespace.

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ kubectl get pods -n istio-system
NAME                                   READY   STATUS    RESTARTS   AGE
istiod-6d67d84bc7-dbzbk                1/1     Running   0          5m59s
istio-ingressgateway-778f44479-rq4j4   1/1     Running   0          5m51s
(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ kubectl get services -n istio-system
NAME                   TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)                                      AGE
istiod                 ClusterIP      10.152.183.182   <none>         15010/TCP,15012/TCP,443/TCP,15014/TCP        6m18s
istio-ingressgateway   LoadBalancer   10.152.183.49    10.64.140.45   15021:31348/TCP,80:31128/TCP,443:32300/TCP   6m10s
sungsoo commented 2 years ago

Kubernetes: microk8s with multiple Istio ingress gateways

Article Source


microk8s has convenient out-of-the-box support for MetalLB and an NGINX ingress controller. But microk8s is also perfectly capable of handling Istio operators, gateways, and virtual services if you want the advanced policy, security, and observability offered by Istio.

In this article, we will install the Istio Operator and allow it to create the Istio Ingress gateway service. We follow that up by creating an Istio Gateway in the default namespace, then create a Deployment and VirtualService projecting onto the Istio Gateway.

To exercise an even more advanced scenario, we will install both a primary and secondary Istio Ingress gateway, each tied to a different MetalLB IP address. This can emulate serving your public customers one set of services, and serving a different set of administrative applications to a private internal network for employees.

This article builds off my previous article where we built a microk8s cluster using Ansible. There are many steps required for Istio setup, so I have wrapped this up into Ansible roles.

Prerequisites

This article builds off my previous article where we built a microk8s cluster using Ansible. If you used Terraform as described to create the microk8s-1 host, you already have an additional 2 network interfaces on the master microk8s-1 host (ens4=192.168.1.141 and ens5=192.168.1.142).

However, a microk8s cluster is not required. You can run the steps in this article on a single microk8s node. But you MUST have an additional two network interfaces and IP addresses on the same network as your host (e.g. 192.168.1.0/24) for the MetalLB endpoints.

Istio Playbook

From the previous article, your last step was running the playbook that deployed a microk8s cluster, playbook_microk8s.yml.

We need to build on top of that and install the Istio Operator, Istio ingress gateway Service, Istio Gateway, and test Virtual Service and Deployment. Run this playbook.

ansible-playbook playbook_metallb_primary_secondary_istio.yml

At the successful completion of this playbook run, you will have Istio installed, two Istio Ingress services, two Istio Gateways, and two independent versions of the sample helloworld deployment served up using different endpoints and certificates.

The playbook does TLS validation using curl as its success criterion. However, it is beneficial for learning to step through the objects created and then execute a smoke test of the TLS endpoints manually. The rest of this article is devoted to these manual validations.

MetalLB validation

View the MetalLB objects.

$ kubectl get all -n metallb-system

NAME READY STATUS RESTARTS AGE
pod/speaker-9xzlc 1/1 Running 0 64m
pod/speaker-dts5k 1/1 Running 0 64m
pod/speaker-r8kck 1/1 Running 0 64m
pod/controller-559b68bfd8-mtl2s 1/1 Running 0 64m

NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/speaker 3 3 3 3 3 beta.kubernetes.io/os=linux 64m

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/controller 1/1 1 1 64m

NAME DESIRED CURRENT READY AGE
replicaset.apps/controller-559b68bfd8 1 1 1 64m

Show the MetalLB configmap with the IP used.

$ kubectl get configmap/config -n metallb-system -o yaml

apiVersion: v1
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.141-192.168.1.142
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: ....
  creationTimestamp: "2021-07-31T10:07:56Z"
  name: config
  namespace: metallb-system
  resourceVersion: "38015"
  selfLink: /api/v1/namespaces/metallb-system/configmaps/config
  uid: 234ad41d-cfde-4bf5-990e-627f74744aad

Istio Operator validation

View the Istio Operator objects in the 'istio-operator' namespace.

$ kubectl get all -n istio-operator

NAME                                        READY   STATUS    RESTARTS   AGE
pod/istio-operator-1-9-7-5d47654878-jh5sr   1/1     Running   1          65m

NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/istio-operator-1-9-7   ClusterIP   10.152.183.120   <none>        8383/TCP   65m

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/istio-operator-1-9-7   1/1     1            1           65m

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/istio-operator-1-9-7-5d47654878   1         1         1       65m

The Operator should be 'Running'. Now check the Istio Operator logs for errors.

$ kubectl logs --since=15m -n istio-operator $(kubectl get pods -n istio-operator -lname=istio-operator -o jsonpath="{.items[0].metadata.name}")

...

- Processing resources for Ingress gateways.
✔ Ingress gateways installed

...

Istio Ingress gateway validation

View the Istio objects in the 'istio-system' namespace. These are objects that the Istio Operator has created.

$ kubectl get pods -n istio-system

NAME                                              READY   STATUS    RESTARTS   AGE
istiod-1-9-7-656bdccc78-rr8hf                     1/1     Running   0          95m
istio-ingressgateway-b9b6fb6d8-d8fbp              1/1     Running   0          94m
istio-ingressgateway-secondary-76db9f9f7b-2zkcl   1/1     Running   0          94m

$ kubectl get services -n istio-system

NAME                             TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                                                                      AGE
istiod-1-9-7                     ClusterIP      10.152.183.198   <none>          15010/TCP,15012/TCP,443/TCP,15014/TCP                                        95m
istio-ingressgateway             LoadBalancer   10.152.183.92    192.168.1.141   15021:31471/TCP,80:32600/TCP,443:32601/TCP,31400:32239/TCP,15443:30571/TCP   94m
istio-ingressgateway-secondary   LoadBalancer   10.152.183.29    192.168.1.142   15021:30982/TCP,80:32700/TCP,443:32701/TCP,31400:31575/TCP,15443:31114/TCP   94m

Notice we have purposely created two Istio ingress gateways: one is for our primary access (such as public customer traffic), and the other mimics a secondary access (perhaps for employee-only management access).

In the services, you will see references to our MetalLB IP endpoints, which is how we will ultimately reach the services projected onto these gateways.

Service and Deployment validation

Istio has an example app called helloworld. Our Ansible playbook created two independent deployments that could be projected onto the two Istio Gateways.

Let's validate these deployments by testing access to the pods and services, without any involvement by Istio.

Service=helloworld, Deployment=helloworld-v1
Service=helloworld2, Deployment=helloworld2-v2

To reach the internal pod and service IP addresses, we need to be inside the cluster itself so we ssh into the master before running these commands:

ssh -i tf-libvirt/id_rsa ubuntu@192.168.122.210

Let's view the deployments, pods, and then services for these two independent applications.

$ kubectl get deployments -n default
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
helloworld2-v2   1/1     1            1           112m
helloworld-v1    1/1     1            1           112m

$ kubectl get pods -n default -l 'app in (helloworld,helloworld2)'

NAME                              READY   STATUS    RESTARTS   AGE
helloworld2-v2-749cc8dc6d-6kbh7   2/2     Running   0          110m
helloworld-v1-776f57d5f6-4gvp7    2/2     Running   0          109m

$ kubectl get services -n default -l 'app in (helloworld,helloworld2)'
NAME          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
helloworld2   ClusterIP   10.152.183.251   <none>        5000/TCP   113m
helloworld    ClusterIP   10.152.183.187   <none>        5000/TCP   113m

First, let's pull from the private pod IP directly.

# internal ip of primary pod
$ primaryPodIP=$(microk8s kubectl get pods -l app=helloworld -o=jsonpath="{.items[0].status.podIPs[0].ip}")

# internal IP of secondary pod
$ secondaryPodIP=$(microk8s kubectl get pods -l app=helloworld2 -o=jsonpath="{.items[0].status.podIPs[0].ip}")

# check pod using internal IP
$ curl http://${primaryPodIP}:5000/hello
Hello version: v1, instance: helloworld-v1-776f57d5f6-4gvp7

# check pod using internal IP
$ curl http://${secondaryPodIP}:5000/hello
Hello version: v2, instance: helloworld2-v2-749cc8dc6d-6kbh7

With internal pod IP proven out, move up to the Cluster IP defined at the Service level.

# IP of primary service
$ primaryServiceIP=$(microk8s kubectl get service/helloworld -o=jsonpath="{.spec.clusterIP}")

# IP of secondary service
$ secondaryServiceIP=$(microk8s kubectl get service/helloworld2 -o=jsonpath="{.spec.clusterIP}")

# check primary service
$ curl http://${primaryServiceIP}:5000/hello
Hello version: v1, instance: helloworld-v1-776f57d5f6-4gvp7

# check secondary service
$ curl http://${secondaryServiceIP}:5000/hello
Hello version: v2, instance: helloworld2-v2-749cc8dc6d-6kbh7

These validations proved out the pod and service independent of the Istio Gateway or VirtualService. Notice all these were using insecure HTTP on port 5000, because TLS is layered on top by Istio.

Exit the cluster ssh session before continuing.

exit

Validate TLS certs

The Ansible scripts created a custom CA and then key+certificates for "microk8s.local" and "microk8s-secondary.local". These are located in the /tmp directory of the microk8s-1 host.

These will be used by the Istio Gateway and VirtualService for secure TLS.

# show primary cert info
$ openssl x509 -in /tmp/microk8s.local.crt -text -noout | grep -E "CN |DNS"
        Issuer: CN = myCA.local
        Subject: CN = microk8s.local
                DNS:microk8s.local, DNS:microk8s-alt.local

# show secondary cert info
$ openssl x509 -in /tmp/microk8s-secondary.local.crt -text -noout | grep -E "CN |DNS"
        Issuer: CN = myCA.local
        Subject: CN = microk8s-secondary.local
                DNS:microk8s-secondary.local

Validate Kubernetes TLS secrets

The keys and certificates will not be used by Istio unless they are loaded as Kubernetes secrets available to the Istio Gateway.

# primary tls secret for 'microk8s.local'
$ kubectl get -n default secret tls-credential
NAME             TYPE                DATA   AGE
tls-credential   kubernetes.io/tls   2      10h

# secondary tls secret for 'microk8s-secondary.local'
$ kubectl get -n default secret tls-secondary-credential
NAME                       TYPE                DATA   AGE
tls-secondary-credential   kubernetes.io/tls   2      10h

# if needed, you can pull the actual certificate from the secret
# it requires a backslash escape for 'tls.crt'
$ kubectl get -n default secret tls-credential -o jsonpath="{.data.tls\.crt}" | base64 --decode
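
The same extraction can be done on the JSON form of the secret (kubectl get secret ... -o json). The following is a minimal sketch that uses a stand-in secret with placeholder content rather than a real certificate; the function name is illustrative.

```python
import base64
import json


def tls_cert_from_secret(secret_json: str) -> str:
    """Return the PEM text stored under data['tls.crt'] of a kubernetes.io/tls Secret."""
    secret = json.loads(secret_json)
    # The key contains a dot; a dict lookup needs no escaping,
    # unlike the jsonpath expression {.data.tls\.crt}.
    return base64.b64decode(secret["data"]["tls.crt"]).decode("ascii")


# Stand-in Secret with placeholder PEM content (not a real certificate).
fake_pem = "-----BEGIN CERTIFICATE-----\nMIIB...placeholder\n-----END CERTIFICATE-----\n"
sample = json.dumps({
    "kind": "Secret",
    "type": "kubernetes.io/tls",
    "data": {"tls.crt": base64.b64encode(fake_pem.encode("ascii")).decode("ascii")},
})
print(tls_cert_from_secret(sample))
```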

Validate Istio Gateway

The Istio Gateway object is the entity that uses the Kubernetes TLS secrets shown above.

$ kubectl get -n default gateway
NAME                               AGE
gateway-ingressgateway-secondary   3h2m
gateway-ingressgateway             3h2m

Digging into the details of the Gateway object, we can see the host name it will be processing as well as the kubernetes tls secret it is using.

# show primary gateway
$ kubectl get -n default gateway/gateway-ingressgateway -o jsonpath="{.spec.servers}" | jq
[
  {
    "hosts": [
      "microk8s.local",
      "microk8s-alt.local"
    ],
    "port": {
      "name": "http",
      "number": 80,
      "protocol": "HTTP"
    }
  },
  {
    "hosts": [
      "microk8s.local",
      "microk8s-alt.local"
    ],
    "port": {
      "name": "https",
      "number": 443,
      "protocol": "HTTPS"
    },
    "tls": {
      "credentialName": "tls-credential",
      "mode": "SIMPLE"
    }
  }
]

# show secondary gateway
$ kubectl get -n default gateway/gateway-ingressgateway-secondary -o jsonpath="{.spec.servers}" | jq
[
  {
    "hosts": [
      "microk8s-secondary.local"
    ],
    "port": {
      "name": "http-secondary",
      "number": 80,
      "protocol": "HTTP"
    }
  },
  {
    "hosts": [
      "microk8s-secondary.local"
    ],
    "port": {
      "name": "https-secondary",
      "number": 443,
      "protocol": "HTTPS"
    },
    "tls": {
      "credentialName": "tls-secondary-credential",
      "mode": "SIMPLE"
    }
  }
]

Notice the first Gateway uses the 'tls-credential' secret, while the second uses 'tls-secondary-credential'.

Validate VirtualService

The bridge that creates the relationship between the purely Istio objects (istio-system/ingressgateway,default/Gateway) and the application objects (pod,deployment,service) is the VirtualService.

This VirtualService is how the application is projected onto a specific Istio Gateway.

$ kubectl get -n default virtualservice
NAME                                           GATEWAYS                               HOSTS                                     AGE
hello-v2-on-gateway-ingressgateway-secondary   ["gateway-ingressgateway-secondary"]   ["microk8s-secondary.local"]              3h14m
hello-v1-on-gateway-ingressgateway             ["gateway-ingressgateway"]             ["microk8s.local","microk8s-alt.local"]   3h14m

Digging down into the VirtualService, you can see it lists the application's route, port, path, the expected HTTP Host header, and Istio gateway to project onto.

# show primary VirtualService
$ kubectl get -n default virtualservice/hello-v1-on-gateway-ingressgateway -o jsonpath="{.spec}" | jq 
{ 
  "gateways": [
    "gateway-ingressgateway"
  ],
  "hosts": [
    "microk8s.local",
    "microk8s-alt.local"
  ],
  "http": [
    {
      "match": [
        {
          "uri": {
            "exact": "/hello"
          }
        }
      ],
      "route": [
        {
          "destination": {
            "host": "helloworld",
            "port": {
              "number": 5000
            }
          }
        }
      ]
    }
  ]
}

# show secondary VirtualService
$ kubectl get -n default virtualservice/hello-v2-on-gateway-ingressgateway-secondary -o jsonpath="{.spec}" | jq
{
  "gateways": [
    "gateway-ingressgateway-secondary"
  ],
  "hosts": [
    "microk8s-secondary.local"
  ],
  "http": [
    {
      "match": [
        {
          "uri": {
            "exact": "/hello"
          }
        }
      ],
      "route": [
        {
          "destination": {
            "host": "helloworld2",
            "port": {
              "number": 5000
            }
          }
        }
      ]
    }
  ]
}
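
To make the projection concrete, here is a deliberately simplified Python model of how the two VirtualService specs above select a destination from the gateway, the Host header, and an exact URI match. It is only an illustration of the matching logic; the actual routing is done by the Envoy proxies in the ingress gateways, and the route function is a made-up name.

```python
from typing import Optional, Tuple

# Gateways, hosts, exact-match URIs and destinations copied from the two specs above.
VIRTUAL_SERVICES = [
    {
        "gateway": "gateway-ingressgateway",
        "hosts": ["microk8s.local", "microk8s-alt.local"],
        "exact_uri": "/hello",
        "destination": ("helloworld", 5000),
    },
    {
        "gateway": "gateway-ingressgateway-secondary",
        "hosts": ["microk8s-secondary.local"],
        "exact_uri": "/hello",
        "destination": ("helloworld2", 5000),
    },
]


def route(gateway: str, host_header: str, uri: str) -> Optional[Tuple[str, int]]:
    """Pick the destination (service, port) for a request, or None if nothing matches."""
    for vs in VIRTUAL_SERVICES:
        if gateway == vs["gateway"] and host_header in vs["hosts"] and uri == vs["exact_uri"]:
            return vs["destination"]
    return None


print(route("gateway-ingressgateway", "microk8s.local", "/hello"))            # ('helloworld', 5000)
print(route("gateway-ingressgateway", "microk8s-secondary.local", "/hello"))  # None
print(route("gateway-ingressgateway", "microk8s.local", "/hello/"))           # None
```

The second and third calls return None because the Host header belongs to the other gateway's VirtualService and because an exact match does not accept a trailing slash.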

Validate URL endpoints

With the validation of all the dependent objects complete, you can now run the ultimate test, which is to send an HTTPS request to the TLS-secured endpoints.

The Gateway requires that the proper FQDN headers be sent by your browser, so it is not sufficient to do a GET against the MetalLB IP addresses. The ansible scripts should have already created entries in the local /etc/hosts file so we can use the FQDN.

# validate that /etc/hosts has entries for URL
$ grep '\.local' /etc/hosts
192.168.1.141   microk8s.local
192.168.1.142   microk8s-secondary.local

# test primary gateway
# we use '-k' because the CA cert has not been loaded at the OS level
$ curl -k https://microk8s.local/hello
Hello version: v1, instance: helloworld-v1-776f57d5f6-4gvp7

# test secondary gateway
$ curl -k https://microk8s-secondary.local/hello
Hello version: v2, instance: helloworld2-v2-749cc8dc6d-6kbh7

Notice from the /etc/hosts output that we have entries corresponding to the MetalLB endpoints. The tie between the MetalLB IP addresses and the Istio ingress gateway objects was shown earlier, but for convenience is repeated below.

# tie between MetalLB and Istio Ingress Gateways
$ kubectl get -n istio-system services
NAME                             TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                                                                      AGE
istiod-1-9-7                     ClusterIP      10.152.183.198   <none>          15010/TCP,15012/TCP,443/TCP,15014/TCP                                        3h30m
istio-ingressgateway             LoadBalancer   10.152.183.92    192.168.1.141   15021:31471/TCP,80:32600/TCP,443:32601/TCP,31400:32239/TCP,15443:30571/TCP   3h30m
istio-ingressgateway-secondary   LoadBalancer   10.152.183.29    192.168.1.142   15021:30982/TCP,80:32700/TCP,443:32701/TCP,31400:31575/TCP,15443:31114/TCP   3h30m

Validate URL endpoints remotely

These same requests can be made from your host machine as well, since the MetalLB endpoints are on the same network as your host (all our actions so far have been from inside the microk8s-1 host). But the Istio Gateway expects a proper HTTP Host header so you have several options:

I've provided a script that you can run from the host for validation:

./test-istio-endpoints.sh
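
One way to send the expected name without touching /etc/hosts is to connect to the MetalLB IP while supplying the FQDN for both TLS SNI and the HTTP Host header, similar in spirit to curl's --resolve option. The sketch below only illustrates that idea and is not the contents of test-istio-endpoints.sh; certificate verification is disabled to mirror curl -k, and the function names are made up for this example.

```python
import socket
import ssl


def build_request(host: str, path: str) -> bytes:
    """Minimal HTTP/1.1 GET carrying the FQDN in the Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch(ip: str, host: str, path: str = "/hello", port: int = 443) -> bytes:
    """Connect to the gateway IP, present `host` as SNI, and return the raw response."""
    context = ssl.create_default_context()
    context.check_hostname = False          # like curl -k: the custom CA is not trusted here
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((ip, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(build_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)


# Example call (needs network access to the gateway), using the IP and FQDN
# from the /etc/hosts entries shown earlier:
#   print(fetch("192.168.1.141", "microk8s.local").decode("utf-8", "replace"))
```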

Conclusion

Using this concept of multiple ingress, you can isolate traffic to different source networks, customers, and services.


sungsoo commented 2 years ago

Microk8s puts up its Istio and sails away

Article Source


Istio almost immediately strikes you as enterprise grade software. Not so much because of the complexity it introduces, but more because of the features it adds to your service mesh. Must-have features packaged together in a coherent framework:

Since microk8s positions itself as the local Kubernetes cluster developers prototype on, it is no surprise that deployment of Istio is made dead simple. Let's start with the microk8s deployment itself:

> sudo snap install microk8s --classic

Istio deployment available with:

> microk8s.enable istio

There is a single question that we need to respond to at this point. Do we want to enforce mutual TLS authentication among sidecars? Istio places a proxy to your services so as to take control over routing, security etc. If we know we have a mixed deployment with non-Istio and Istio enabled services we would rather not enforce mutual TLS:

> microk8s.enable istio
Enabling Istio
Enabling DNS
Applying manifest
service/kube-dns created
serviceaccount/kube-dns created
configmap/kube-dns created
deployment.extensions/kube-dns created
Restarting kubelet
DNS is enabled
Enforce mutual TLS authentication (https://bit.ly/2KB4j04) between sidecars? If unsure, choose N. (y/N): y

Believe it or not, we are done. Istio v1.0 services are being set up; you can check the deployment progress with:

> watch microk8s.kubectl get all --all-namespaces

We have packaged istioctl in microk8s for your convenience:

> microk8s.istioctl get all --all-namespaces
NAME                          KIND                                      NAMESPACE      AGE
grafana-ports-mtls-disabled   Policy.authentication.istio.io.v1alpha1   istio-system   2m
DESTINATION-RULE NAME   HOST                                             SUBSETS   NAMESPACE      AGE
istio-policy            istio-policy.istio-system.svc.cluster.local                istio-system   3m
istio-telemetry         istio-telemetry.istio-system.svc.cluster.local             istio-system   3m
GATEWAY NAME                      HOSTS     NAMESPACE      AGE
istio-autogenerated-k8s-ingress   *         istio-system   3m

Do not get scared by the number of services and deployments; everything is under the istio-system namespace. We are ready to start exploring!

Demo Time!

Istio needs to inject sidecars into the pods of your deployment. In microk8s auto-injection is supported, so the only thing you have to do is label the namespace you will be using with istio-injection=enabled:

> microk8s.kubectl label namespace default istio-injection=enabled

Let's now grab the bookinfo example from the v1.0 Istio release and apply it:

> wget https://raw.githubusercontent.com/istio/istio/release-1.0/samples/bookinfo/platform/kube/bookinfo.yaml
> microk8s.kubectl create -f bookinfo.yaml

The following services should be available soon:

> microk8s.kubectl get svc
NAME          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)
details       ClusterIP   10.152.183.33    <none>        9080/TCP
kubernetes    ClusterIP   10.152.183.1     <none>        443/TCP
productpage   ClusterIP   10.152.183.59    <none>        9080/TCP
ratings       ClusterIP   10.152.183.124   <none>        9080/TCP
reviews       ClusterIP   10.152.183.9     <none>        9080/TCP

We can reach the services using the ClusterIP they have; we can for example get to the productpage in the above example by pointing our browser to 10.152.183.59:9080. But let's play by the rules and follow the official instructions on exposing the services via NodePort:

> wget https://raw.githubusercontent.com/istio/istio/release-1.0/samples/bookinfo/networking/bookinfo-gateway.yaml
> microk8s.kubectl create -f bookinfo-gateway.yaml

To get to the productpage through ingress we shamelessly copy the example instructions:

> microk8s.kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}'
31380

And our node is the localhost so we can point our browser to http://localhost:31380/productpage
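
The two steps above (look up the NodePort, then build the URL) can be combined in a small script. This is a sketch only: `ingress_url` is a hypothetical helper, and the port is hard-coded to the sample value shown above so the snippet runs without a cluster.

```shell
# Hypothetical helper: compose the ingress URL from a node address and NodePort.
ingress_url() {
  node_host="$1"
  node_port="$2"
  path="${3:-/productpage}"
  echo "http://${node_host}:${node_port}${path}"
}

# On a live cluster the port would come from the jsonpath query shown above:
# INGRESS_PORT=$(microk8s.kubectl -n istio-system get service istio-ingressgateway \
#   -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
INGRESS_PORT=31380   # sample value from the output above

ingress_url localhost "$INGRESS_PORT"
```

The printed URL can then be opened in a browser or passed to curl, for example `curl -s -o /dev/null -w '%{http_code}\n' "$(ingress_url localhost "$INGRESS_PORT")"`.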

Show me some graphs!

Of course graphs look nice in a blog post, so here you go.

The Grafana Service

You will need to grab the ClusterIP of the Grafana service:

microk8s.kubectl -n istio-system get svc grafana

Prometheus is also available in the same way.

microk8s.kubectl -n istio-system get svc prometheus

The Prometheus Service

And for traces you will need to look at the jaeger-query.

microk8s.kubectl -n istio-system get service/jaeger-query

The Jaeger Service

The servicegraph endpoint is available with:

microk8s.kubectl -n istio-system get svc servicegraph

The ServiceGraph

I should stop here. Go and check out the Istio documentation for more details on how to take advantage of what Istio is offering.

What to keep from this post

References

sungsoo commented 2 years ago

Verifying the KServe installation

After installing by following the KServe Quick Start (quick_install.sh), verify that the installation succeeded.

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ k get pod -n kserve
NAME                          READY   STATUS    RESTARTS   AGE
kserve-controller-manager-0   2/2     Running   0          3d21h
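
If you want to script this check instead of reading the table by eye, the READY and STATUS columns can be parsed with awk. The helper below is a hypothetical sketch, not part of KServe; it assumes the default `kubectl get pod` column layout shown above and treats anything other than a fully ready, Running pod as not ready.

```shell
# Hypothetical helper: read `kubectl get pod` output on stdin and succeed only
# if at least one pod is listed and every pod is Running with all containers ready.
all_pods_ready() {
  awk 'NR > 1 {
         n++
         split($2, ready, "/")          # READY column, e.g. "2/2"
         if (ready[1] != ready[2] || $3 != "Running") bad = 1
       }
       END { exit ((n > 0 && !bad) ? 0 : 1) }'
}

# Demo with the sample output from above; on a cluster use:
#   kubectl get pod -n kserve | all_pods_ready
printf '%s\n' \
  'NAME                          READY   STATUS    RESTARTS   AGE' \
  'kserve-controller-manager-0   2/2     Running   0          3d21h' \
  | all_pods_ready && echo "kserve control plane is ready"
```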
sungsoo commented 2 years ago

Working with Microk8s

Resetting microk8s

(pytorch) ╭─sungsoo@sungsoo-HP-Z840 ~
╰─$ microk8s reset
Disabling all addons.
Disabling addon : ambassador
Disabling addon : cilium
Disabling addon : dashboard
Disabling addon : dns
Disabling addon : fluentd
Disabling addon : gpu
Disabling addon : helm
Disabling addon : helm3
Disabling addon : host-access
Disabling addon : ingress
Disabling addon : istio
Disabling addon : jaeger
...
sungsoo commented 2 years ago

Serverless Installation Guide

KServe Serverless installation enables autoscaling based on request volume and supports scale down to and from zero. It also supports revision management and canary rollout based on revisions.

Kubernetes 1.20 is the minimum required version. Check the table below for the recommended Knative and Istio versions for your Kubernetes version.

Kubernetes Version | Recommended Istio Version | Recommended Knative Version
-- | -- | --
1.20 | 1.9, 1.10, 1.11 | 0.25, 0.26, 1.0
1.21 | 1.10, 1.11 | 0.25, 0.26, 1.0
1.22 | 1.11, 1.12 | 0.25, 0.26, 1.0
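
For scripted pre-flight checks, the compatibility table above can be encoded as a lookup. This is a minimal sketch: `recommended_versions` is a hypothetical helper whose data is copied from the table, and it knows nothing about Kubernetes versions the table does not list.

```shell
# Hypothetical helper: print the recommended Istio and Knative versions
# for a Kubernetes minor version, using the values from the table above.
recommended_versions() {
  case "$1" in
    1.20) echo "Istio 1.9, 1.10, 1.11 | Knative 0.25, 0.26, 1.0" ;;
    1.21) echo "Istio 1.10, 1.11 | Knative 0.25, 0.26, 1.0" ;;
    1.22) echo "Istio 1.11, 1.12 | Knative 0.25, 0.26, 1.0" ;;
    *)    echo "no recommendation listed for Kubernetes $1" >&2
          return 1 ;;
  esac
}

recommended_versions 1.22
```

The non-zero return code for unlisted versions lets an install script stop early rather than proceed with an untested combination.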
sungsoo commented 2 years ago

KServe setup and testing (starting from 5 July)

Prerequisites

  1. Microk8s with Kubeflow: you have an installed version of Kubeflow.
  2. Fundamental concepts of Kubeflow, Istio, Knative, and KServe (formerly KFServing)
    • You need to understand the following core concepts related to model serving in Kubeflow.
    • Since we can't delve deeply into every topic, we would like to provide a short list of our favorite primers on Kubeflow, especially on serving topics.

0. Installing Kubeflow

We assume that you have already installed Kubeflow by using the following guide.

1. KServe Installation

  1. Install Istio. Please refer to the Istio install guide.

  2. Install Knative Serving. Please refer to the Knative Serving install guide.

Note: If you want to use PodSpec fields such as nodeSelector, affinity, or tolerations, which are now supported in the v1beta1 API spec, you need to turn on the corresponding feature flags in your Knative configuration.

  3. Install Cert Manager. The minimum required Cert Manager version is 1.3.0; refer to the Cert Manager documentation.

Note: Cert Manager is required to provision webhook certificates for a production-grade installation. Alternatively, you can run the self-signed certificate generation script.

  4. Install KServe
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.8.0/kserve.yaml
  5. Install KServe Built-in ClusterServingRuntimes
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.8.0/kserve-runtimes.yaml

Note: ClusterServingRuntimes are required to create an InferenceService for built-in model serving runtimes with KServe v0.8.0 or higher.

sungsoo commented 1 year ago

When errors occur after reinstalling Microk8s

After reinstalling microk8s, trying to install Istio produces the following error.

(base) ╭─sungsoo@z840 ~/kubeflow/istio-1.11.0
╰─$ bin/istioctl install
Error: fetch Kubernetes config file: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused

Run the command below to refresh the kubeconfig.

(base) ╭─sungsoo@z840 ~/kubeflow/istio-1.11.0
╰─$ microk8s config > ~/.kube/config
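
The connection-refused error on localhost:8080 is what Kubernetes clients typically report when they cannot find a usable kubeconfig and fall back to their default address. The redirect above fixes that, but it overwrites any existing file and fails if ~/.kube does not exist. A slightly more defensive variant might look like this; `refresh_kubeconfig` is a hypothetical helper, not a microk8s command.

```shell
# Hypothetical helper: regenerate the kubeconfig from microk8s, keeping a
# backup of the previous file and never leaving a truncated config behind.
refresh_kubeconfig() {
  kube_dir="${1:-$HOME/.kube}"
  mkdir -p "$kube_dir" || return 1
  if [ -f "$kube_dir/config" ]; then
    cp "$kube_dir/config" "$kube_dir/config.bak"   # keep the old config
  fi
  # Write to a temporary file first so a failed export does not clobber the config.
  microk8s config > "$kube_dir/config.new" || return 1
  mv "$kube_dir/config.new" "$kube_dir/config"
  chmod 600 "$kube_dir/config"                     # credentials: owner-only access
}
```

Call `refresh_kubeconfig` with no arguments to update ~/.kube/config, then rerun `bin/istioctl install`.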
yurkoff-mv commented 1 year ago

https://github.com/sungsoo/sungsoo.github.io/issues/21#issuecomment-1170553883

I wanted to bypass Dex when accessing Inference Services from the outside. In my case, it was necessary to deploy an additional policy; otherwise there was no access:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-inference-services
  namespace: istio-system
spec:
  selector:
    matchLabels:
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account"]
  - to:
    - operation:
        methods: ["POST"]
        paths: ["/v1*"]

Also, these actions seem to lead to future crashes: https://github.com/kubeflow/manifests/issues/2309#issue-1434652347