canonical / microk8s

MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.
https://microk8s.io
Apache License 2.0

Can't finish installing kubeflow #597

Closed ycheng closed 4 years ago

ycheng commented 5 years ago

OS: Ubuntu 18.04.3 LTS
GPU: Quadro P400 x 2
Procedure: https://ubuntu.com/kubeflow/install

After running install-kubeflow.sh, kubectl apply --validate=false -f default.yaml can't finish; it fails with something like:

https://pastebin.ubuntu.com/p/wGg6ZMkC66/

unable to recognize "default.yaml": no matches for kind "CompositeController" in version "metacontroller.k8s.io/v1alpha1"
unable to recognize "default.yaml": no matches for kind "Application" in version "app.k8s.io/v1beta1"

In my tests, it finished successfully once. I tried resetting microk8s and waiting for all kube-system pods to be running, but I can still reproduce this.

The output of microk8s.inspect is attached as follows:

inspection-report-20190815_061551.tar.gz

ktsakalozos commented 5 years ago

@carmine may be able to offer some insight into this issue. Two days ago I tested these scripts and they worked (but did not work for @ycheng). Today I see these scripts were updated to use v1.15 and the deployment is failing for me with:

+ ks show default -c metacontroller -c application
ERROR find objects: lib/ksonnet-lib/v1.15.2/k8s.libsonnet:17681:93-94 Unknown variable: x

          "withX-Kubernetes-Embedded-Resource":: self + { "x-kubernetes-embedded-resource": x-kubernetes-embedded-resource },

Not sure why. We should bring these scripts over to the MicroK8s repository under a microk8s.addon script so they go through our CI process as we release MicroK8s updates.

ycheng commented 5 years ago

I tried again with 1.14 on another machine, and it works fine. I'll re-test on the original one (which has two GPUs installed) later when it's available.

With 1.15, I also saw the same issue.

On checking, it has the same error as https://github.com/kubeflow/kubeflow/issues/3544.

ycheng commented 5 years ago

If I run KUBEFLOW_VERSION=0.6.1 ./install-kubeflow.sh with microk8s 1.15, it eventually fails at

ycheng commented 5 years ago

Just managed to run kubeflow 0.6 on microk8s 1.15. Key points:

  1. Enable istio, dns, storage and dashboard.
  2. kustomize version: 3.1.0; kfctl: 0.6.1 (after downloading, it reports version v0.6.1-rc.2-1-g3a37cbc6)
  3. I downloaded kfctl_k8s_istio.yaml and commented out two lines:
    # Istio install. If not needed, comment out istio-crds and istio-install.
    -  - istio-crds
    -  - istio-install
    +  #- istio-crds
    +  #- istio-install
  4. The path to the config file needs to be a URL, so I used file:///home/...../kfctl_k8s.yaml
  5. The main dashboard runs in the k8s service "centraldashboard".
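
Points 3 and 4 above could be scripted roughly as follows. This is only a sketch: it assumes the two entries appear as plain "- istio-crds" and "- istio-install" list items, and that the file path contains no characters that would need percent-encoding in a URL. The helper names comment_out_istio and to_file_url are made up for illustration and are not part of kfctl or MicroK8s.

```shell
#!/usr/bin/env bash
# Sketch only: both helper names are hypothetical.

# Comment out the istio-crds and istio-install list entries,
# keeping their original indentation. Writes via a temp file
# so it does not depend on a particular sed -i flavour.
comment_out_istio() {
  sed -E 's/^([[:space:]]*)- (istio-crds|istio-install)[[:space:]]*$/\1#- \2/' "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# Turn a local path into a file:// URL, since the config path must be a URL.
# Assumption: the path needs no percent-encoding (no spaces etc.).
to_file_url() {
  local dir
  dir=$(cd "$(dirname "$1")" && pwd) || return 1
  printf 'file://%s/%s\n' "$dir" "$(basename "$1")"
}
```

For example, run comment_out_istio kfctl_k8s_istio.yaml in the download directory, then use the output of to_file_url kfctl_k8s_istio.yaml wherever the config file location is expected.
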

ycheng commented 5 years ago

On double-checking, the dashboard for kubeflow 0.6 looks good, but clicking into the details shows nothing. I tried again today, and some archive files on the server have changed. It seems more time is needed for things to converge.

tvansteenburgh commented 5 years ago

/cc @knkski

carmine commented 5 years ago

We are re-validating the install steps for kubeflow 0.6.1 now and will update this ticket / the site accordingly.

ycheng commented 4 years ago

The steps in https://ubuntu.com/kubeflow/install work, with two minor changes.

  1. export VERSION='curl ...' does not work because the quoting is incorrect. Maybe change it to export VERSION=$(curl ....)

  2. When running "kfctl apply all -V", it complains that the namespace kubeflow-anonymous does not exist. Wait longer and it will pass.

    Ref: https://github.com/kubeflow/kubeflow/issues/4090

    1. Notebooks can only be created in the namespace kubeflow-anonymous. Ref: kubeflow/kubeflow#4156
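
The quoting problem in item 1 can be seen without touching the network. In the sketch below, echo stands in for the real curl command (which is elided in the instructions), and v0.0.0 is a made-up placeholder value.

```shell
#!/usr/bin/env bash
# echo is a stand-in for the elided curl command; v0.0.0 is a placeholder.

# Single quotes: the text is stored literally and nothing is executed.
VERSION_LITERAL='echo v0.0.0'

# Command substitution: the command runs and its output is stored.
VERSION_SUBST=$(echo v0.0.0)

printf 'single quotes: %s\n' "$VERSION_LITERAL"
printf 'substitution:  %s\n' "$VERSION_SUBST"
```

Backticks would also perform command substitution, but $(...) is generally easier to read and to nest.
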

ktsakalozos commented 4 years ago

The steps in https://ubuntu.com/kubeflow/install work, with two minor changes

/cc @ammarn911

ammarn911 commented 4 years ago

Thanks @ktsakalozos, I'll update the installation instructions.

kschroed commented 4 years ago

I've tried the two changes suggested above by @ycheng but am getting an exit status of

Error: couldn't apply KfApp: (kubeflow.error): Code 500 with message: kfApp Apply failed for kustomize: (kubeflow.error): Code 500 with message: couldn't create resources from application Error: no matches for kind "Application" in version "app.k8s.io/v1beta1"

This is on microk8s 1.15.3 and kubeflow v0.6.2

ycheng commented 4 years ago

@kschroed I tried again today and it still works. My snap list output is:

Name      Version  Rev   Tracking  Publisher   Notes
core      16-2.41  7713  stable    canonical✓  core
microk8s  v1.15.3  826   stable    canonical✓  classic

I also put all those steps into scripts at https://github.com/ycheng/microk8s-kubeflow-install. They are the same as the steps on the web page, with minor enhancements. To use them:

$ ./microk8s-install.bash
# logout and login to get microk8s group.
$ ./kubeflow-init.sh
$ ./kubeflow-apply.sh

Sorry that I don't know why it failed on your side based on the log you provided.

kschroed commented 4 years ago

I ran through it again and it got past that stage, but then hung because there was no kubeflow-anonymous namespace. After I created the namespace manually, it completed and is functional, with the exception of the argo runtime working with containerd.
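
As an alternative to creating the namespace by hand, one option is to poll until it shows up before continuing. The following is a minimal sketch and not something taken from the install instructions: wait_for is a hypothetical helper that retries an arbitrary command, and the attempt count and delay are arbitrary choices.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_for <max_attempts> <delay_seconds> <command> [args...]
wait_for() {
  local attempts delay i
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the command quietly; stop as soon as it succeeds.
    "$@" >/dev/null 2>&1 && return 0
    # Sleep only if another attempt will follow.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

With MicroK8s this might be called as wait_for 60 5 microk8s.kubectl get namespace kubeflow-anonymous, which checks every 5 seconds for up to roughly 5 minutes and returns non-zero if the namespace never appears.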

ycheng commented 4 years ago

@kschroed please check https://github.com/kubeflow/kubeflow/issues/4090 for kubeflow-anonymous.

Per current status, I believe we can close this bug. Feel free to open another for other issues.