machine-learning-exchange / mlx

Machine Learning eXchange (MLX). Data and AI Assets Catalog and Execution Engine
https://ml-exchange.org/
Apache License 2.0

Local deployment on KIND needs update #318

Closed BrandonYifanLiu closed 2 years ago

BrandonYifanLiu commented 2 years ago

Describe the bug

When deploying MLX by following the instructions here, I hit errors of the following form:

unable to recognize "STDIN": no matches for kind "XXXXX" in version "xxxx/xxxx"

To Reproduce

With the following CLI versions:

brandonliu@Brandons-MacBook-Pro ~ % kind --version          
kind version 0.12.0
brandonliu@Brandons-MacBook-Pro ~ % kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-02-16T12:30:48Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-03-06T21:39:59Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/arm64"}
brandonliu@Brandons-MacBook-Pro ~ % kustomize version
{Version:kustomize/v4.5.2 GitCommit:9091919699baf1c5a5bf71b32ca73a993e98088b BuildDate:2022-02-09T23:19:28Z GoOs:darwin GoArch:arm64}

Steps to reproduce the behavior:

  1. Go to this link and follow the instructions.
  2. Run the command: while ! kustomize build mlx-single-kind | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
  3. See the errors:
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "RoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "RoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "Gateway" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "Gateway" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "VirtualService" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "VirtualService" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "VirtualService" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "VirtualService" in version "networking.istio.io/v1alpha3"
    unable to recognize "STDIN": no matches for kind "AuthorizationPolicy" in version "security.istio.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "AuthorizationPolicy" in version "security.istio.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "MutatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "MutatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "ValidatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
    unable to recognize "STDIN": no matches for kind "ValidatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"

Expected behavior

The command should return without errors.

BrandonYifanLiu commented 2 years ago

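I checked the Kubernetes docs, and this appears to be caused by API deprecation: the v1beta1 versions of CustomResourceDefinition, RoleBinding/ClusterRoleBinding, and the admission webhook configurations were removed in Kubernetes 1.22. See https://kubernetes.io/docs/reference/using-api/deprecation-guide/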
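One way to confirm this is to compare the API versions requested by the rendered manifests with the ones the cluster actually serves. The following is a minimal sketch, not part of the MLX docs: the helper names list_api_versions and missing_from and the /tmp/served.txt path are made up for illustration. Istio kinds such as EnvoyFilter would likely show up as missing too, since their CRDs are among the resources that failed to install.

```shell
# Hypothetical helper: print the distinct apiVersion values found in a YAML
# stream on stdin. Only top-level `apiVersion:` keys are matched.
list_api_versions() {
  sed -n 's/^apiVersion:[[:space:]]*//p' | tr -d "\"'" | sort -u
}

# Hypothetical helper: print the lines of stdin that do NOT appear in the
# file given as $1 (here: requested API versions the cluster does not serve).
missing_from() {
  grep -vxF -f "$1" || true
}

# Usage (needs a reachable cluster and the mlx-single-kind overlay; not run here):
#   kubectl api-versions > /tmp/served.txt
#   kustomize build mlx-single-kind | list_api_versions | missing_from /tmp/served.txt
```

On a 1.23 cluster, the output would be expected to include entries such as apiextensions.k8s.io/v1beta1, matching the errors reported above.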

ckadner commented 2 years ago

Hi @BrandonYifanLiu thanks for reporting this. Did the deployment succeed eventually?

@Tomcli I think I have seen (some of) these error/warning messages before but the deployment seemed to succeed eventually

You had mentioned to

# run the below command two times if the CRDs take too long to provision

Is this connected?

BrandonYifanLiu commented 2 years ago

As a workaround discussed today, I will downgrade the Kubernetes version to make the deployment succeed locally and document the process.

Tomcli commented 2 years ago

> Hi @BrandonYifanLiu thanks for reporting this. Did the deployment succeed eventually?
>
> @Tomcli I think I have seen (some of) these error/warning messages before but the deployment seemed to succeed eventually
>
> You had mentioned to
>
> # run the below command two times if the CRDs take too long to provision
>
> Is this connected?

Yes, the CRDs take some time to deploy, so we might need to run it twice if we deploy MLX on a new cluster.
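A bounded retry makes the "run it twice" advice explicit and avoids looping forever when the failure is not a timing problem. This is only a sketch: the retry and apply_mlx helpers are hypothetical names, and the attempt count and delay are arbitrary.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between failed attempts. Returns 0 on the first success, 1 otherwise.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt $attempt of $max_attempts failed" >&2
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Same pipeline as in the reproduction steps, wrapped so it can be retried.
apply_mlx() {
  kustomize build mlx-single-kind | kubectl apply -f -
}

# Usage (needs kustomize, kubectl, and a cluster; not run here):
#   retry 5 10 apply_mlx
#   kubectl wait --for condition=established --timeout=60s crd --all
```

If every attempt fails with the same "no matches for kind" errors, the cause is more likely the removed API versions than slow CRD provisioning.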

ckadner commented 2 years ago

> As a workaround discussed today, I will downgrade the Kubernetes version to make the deployment succeed locally and document the process.

Hi @BrandonYifanLiu -- could you create a PR to update the docs with the Kubernetes versions we need to use and how to do it with KIND?

ckadner commented 2 years ago

@rafvasq -- I think this is the issue you encountered today? Could you verify the Kubernetes version you were using? For reference, Brandon had 1.23.4:

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-02-16T12:30:48Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-03-06T21:39:59Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/arm64"}

The supported (required) Kubernetes version should probably be no newer than 1.21.

Action items:

  1. Update KIND deployment doc:
    • mention the upper bound on the Kubernetes version in the Create KIND Cluster paragraph
    • describe how to create clusters for older Kubernetes versions and link to the KIND docs
    • K8s versions should align with kindest/node tags
    • update the kind create cluster --name mlx code snippet to specify the K8s version v1.21 (kindest/node:v1.21.12)
    • kind create cluster ... --image kindest/node:v1.21.12
    • add Troubleshooting section with error message and pointer to use K8s version stated in prereqs
  2. Update Kubernetes deployment doc:
    • mention Kubernetes high version boundary in prereqs
    • add Troubleshooting section with error message and pointer to use K8s version stated in prereqs
  3. Update MLX deployment specs for Kubernetes 1.22 and 1.23
    • this will require help from @Tomcli and/or @yhwang
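As a rough sketch of the KIND items above, the cluster can be pinned to the kindest/node:v1.21.12 image and the server version checked before applying MLX. The helper name minor_at_most and the use of jq are assumptions made for illustration; neither comes from the MLX docs.

```shell
# Hypothetical helper: succeeds if the minor version of a "vMAJOR.MINOR.PATCH"
# string ($1) is no newer than the given bound ($2).
minor_at_most() {
  minor=$(printf '%s\n' "$1" | sed -n 's/^v\{0,1\}[0-9][0-9]*\.\([0-9][0-9]*\).*/\1/p')
  [ -n "$minor" ] && [ "$minor" -le "$2" ]
}

# Usage (needs kind, kubectl, and jq; not run here):
#   kind create cluster --name mlx --image kindest/node:v1.21.12
#   server=$(kubectl version -o json | jq -r '.serverVersion.gitVersion')
#   minor_at_most "$server" 21 || echo "Kubernetes $server is newer than 1.21" >&2
```

The check reads the server version rather than the client version, because the server decides which API versions are available.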

rafvasq commented 2 years ago

@ckadner, I've tested and continue to run into this issue with kubectl versions 1.20 and 1.21. Continuing to test but my latest attempt used the following:

ckadner commented 2 years ago

@rafvasq -- I just merged the PR from @kiranp2396 so the kind create cluster command should work as expected now.