elotl / kip

Virtual-kubelet provider running pods in cloud instances
Apache License 2.0

Issue setting up kip with minikube #159

Closed joh4n closed 4 years ago

joh4n commented 4 years ago

Hi, I just came across kip and wanted to try it out with minikube. I have gone through the installation instructions several times and I still cannot get it to run correctly.

It looks like some pods and/or nodes do not get created properly when starting kip.

First I add the AWS credentials in deploy/manifests/kip/base/provider.yaml, then:

minikube start
kustomize build deploy/manifests/kip/base | kubectl apply -f -

I get the output

serviceaccount/kip-network-agent created
serviceaccount/kip-provider created
clusterrole.rbac.authorization.k8s.io/kip-provider created
clusterrole.rbac.authorization.k8s.io/kip-network-agent created
clusterrolebinding.rbac.authorization.k8s.io/kip-provider created
clusterrolebinding.rbac.authorization.k8s.io/kip-network-agent created
configmap/kip-config-8gf89h865f created
secret/kip-network-agent created
service/kip-provider created
statefulset.apps/kip-provider created
persistentvolumeclaim/kip-provider-data created

I do not see any pods:

kubectl get pods
No resources found in default namespace.

or any nodes related to kip:

kubectl get nodes
NAME       STATUS   ROLES    AGE    VERSION
minikube   Ready    master   7m3s   v1.18.3

as mentioned in the README: "After applying, you should see a new kip pod in the kube-system namespace and a new node named "kip-0" in the cluster."

and, unsurprisingly:

kubectl -nkube-system logs kip-0 -c kip -f
Error from server (NotFound): pods "kip-0" not found
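(Checking every namespace would also show whether the kip pod landed anywhere at all, a quick check:)

kubectl get pods --all-namespaces | grep kip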

When trying to deploy a basic nginx Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        type: virtual-kubelet
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80

with kubectl apply -f nginx-deployment-virt-kub.yaml

the pods get stuck as Pending:

kubectl describe pod nginx-deployment-79cbb8c99-9xptz
Name:           nginx-deployment-79cbb8c99-9xptz
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=nginx
                pod-template-hash=79cbb8c99
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/nginx-deployment-79cbb8c99
Containers:
  nginx:
    Image:        nginx:1.14.2
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vdmvx (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-vdmvx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-vdmvx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  type=virtual-kubelet
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  3s (x3 over 77s)  default-scheduler  0/1 nodes are available: 1 node(s) didn't match node selector.
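(A quick check that confirms the scheduler message above, i.e. that no node currently carries the selector label:)

kubectl get nodes -l type=virtual-kubelet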

versions

I tried this on the latest master (hash 891adef0e0f0e552956254a3dbe2a9e01fac9aa6) as well as on v0.0.17 and v0.0.15.

kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.5", GitCommit:"e6503f8d8f769ace2f338794c914a96fc335df0f", GitTreeState:"archive", BuildDate:"2020-07-01T16:28:46Z", GoVersion:"go1.14.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.3", GitCommit:"2e7996e3e2712684bc73f0dec0200d64eec7fe40", GitTreeState:"clean", BuildDate:"2020-05-20T12:43:34Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
minikube version
minikube version: v1.11.0
commit: 57e2f55f47effe9ce396cea42a1e0eb4f611ebbd
myechuri commented 4 years ago

Thanks a lot for trying kip, @joh4n !

First I add the aws credentials in deploy/manifests/kip/base/provider.yaml

If you want to use kip on minikube, the kustomize scripts in https://github.com/elotl/kip/blob/master/deploy/manifests/kip/overlays/minikube/ are a better choice. Can you please retry with the manifests in that directory? If you still do not see the kip node, kubectl -n kube-system get pods should show a *kip* pod. Can you please share kubectl logs from the kip pod? Thanks!
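A sketch of that retry, assuming the overlay's provider.yaml has already been filled in:

kustomize build deploy/manifests/kip/overlays/minikube | kubectl apply -f -
kubectl -n kube-system get pods
kubectl -n kube-system logs <kip-pod-name> -c kip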

justnoise commented 4 years ago

One other note: We updated the name of the kip pod to kip-provider-0 and missed updating the references in the README and troubleshooting doc. Apologies!

As @myechuri pointed out, you'll want to give the minikube manifests a shot. Minikube is a bit of a different setup for kip and you'll need to configure a couple of additional items to make kip work (check out the instructions here: https://github.com/elotl/kip/blob/master/deploy/manifests/kip/overlays/minikube/kustomization.yaml#L1-L26):

  1. Credentials for starting instances in AWS, since kip can't use an instance profile if it's running on a laptop. The credentials can be specified using a secret or directly in provider.yaml (but a secret is preferred).
  2. You'll need to tell kip where it should launch instances (what VPC & subnet) in provider.yaml
  3. You'll need to create an additional security group that kip can use to connect to instances it creates. That security group should, at a minimum, open TCP port 6421 to traffic from your laptop. That security group needs to be manually specified in provider.yaml as well: https://github.com/elotl/kip/blob/master/deploy/manifests/kip/overlays/minikube/provider.yaml#L27-L29
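A sketch of creating such a security group with the AWS CLI (the group name, VPC ID, security group ID, and laptop IP below are placeholders):

aws ec2 create-security-group --group-name kip-cells --description "kip cell access from laptop" --vpc-id vpc-xxxxxxxx
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 6421 --cidr <laptop-public-ip>/32

The resulting security group id is what you put into extraSecurityGroups in provider.yaml.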

Feel free to comment here if things are unclear or if you run into issues.

joh4n commented 4 years ago

I missed the minikube folder, thanks. However, I followed the instructions in: https://github.com/elotl/kip/blob/master/deploy/manifests/kip/overlays/minikube/kustomization.yaml#L1-L26

with provider.yaml

apiVersion: v1
cloud:
  aws:
    # You can also use environment variables for region, access and secret key.
    region: eu-central-1
    accessKeyID: ""
    secretAccessKey: ""
    vpcID: vpc-0eddxxxxxxxxxxx
    subnetID: subnet-0eexxxxxxxx
etcd:
  internal:
    dataDir: /opt/kip/data
cells:
  standbyCells:
  defaultInstanceType: t3.nano
  defaultVolumeSize: 15G
  bootImageSpec:
    owners: 689494258501
    filters: name=elotl-kip-*
  nametag: minikube
  itzo:
    url: https://itzo-kip-download.s3.amazonaws.com
    version: latest
# Optional, if kip needs to connect to cells via public IPs.
#  extraCIDRs:
#  - FILL_IN
  extraSecurityGroups:
  - sg-0bf4xxxxxxx
kubelet:
  cpu: "100"
  memory: "512Gi"
  pods: "200"

and kustomization.yaml

bases:
  - ../minikube
namespace: kube-system
configMapGenerator:
  - name: kip-config
    behavior: merge
    files:
      - provider.yaml
secretGenerator:
  - name: kip-secrets
    literals:
      - AWS_ACCESS_KEY_ID=AKIxxxxxxxxx
      - AWS_SECRET_ACCESS_KEY=1lXxxxxxxxxxxxxxxxxx

with that kustomization.yaml I get the error:

Error: merging from generator &{0xc0007f6120 { } {{ kip-config merge {[] [provider.yaml] []} <nil>}}}: id resid.ResId{Gvk:resid.Gvk{Group:"", Version:"v1", Kind:"ConfigMap"}, Name:"kip-config", Namespace:""} does not exist; cannot merge or replace
error: no objects passed to apply
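(A quick way to see which ConfigMap name the base overlay actually generates, assuming the ../minikube path from the bases entry builds on its own:)

kustomize build ../minikube | grep -A5 'kind: ConfigMap'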

so I updated it with

.
.
configMapGenerator:
  - name: config
    behavior: merge
.
.
.

That runs, but I get the error:

 kubectl -n kube-system logs kip-provider-0
error: a container name must be specified for pod kip-provider-0, choose one of: [kip kube-proxy] or one of the init containers: [init-cert]
myechuri commented 4 years ago

Hi @joh4n ,

kubectl -n kube-system logs kip-provider-0
error: a container name must be specified for pod kip-provider-0, choose one of: [kip kube-proxy] or one of the init containers: [init-cert]

Can you please run kubectl logs -n kube-system kip-provider-0 kip? kip-provider-0 is a pod with a kip container in it, and we would be interested in the logs from the kip container.

Also, can you please confirm if you see kip-provider-0 in the output of kubectl get nodes? Thanks.

myechuri commented 4 years ago

region: eu-central-1

@joh4n : the kip trial is currently set up for the us-east-1 region. Apologies for not calling this out in the README; we will update it. Can you please try us-east-1? Thanks.
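For instance, only the region line in the cloud section of provider.yaml needs to change (a sketch of the relevant part):

cloud:
  aws:
    region: us-east-1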

joh4n commented 4 years ago

This is with us-east-1 as the region.

Screenshot from 2020-07-28 19-52-47 (ignore the fatal: ref HEAD is not a symbolic ref message; it is a setup issue in my zsh when I check out a tag and I have been too lazy to fix it)

myechuri commented 4 years ago

@joh4n : let me repeat your steps with your kustomization.yaml and see if I can reproduce your error. Will update by the end of the day.

myechuri commented 4 years ago

@joh4n : I reproduced your CreateContainerConfigError in my local setup. The cause of the failure is below:

  Warning  Failed            17s (x5 over 70s)  kubelet, m01       Error: secret "provider-secret" not found

This is because the minikube kustomize overlay scripts are out of sync with the base. Let me fix that and share an update by the end of day PT Wednesday.

myechuri commented 4 years ago

@joh4n : the issue is now fixed in master. There were two issues: 1) The instructions at the beginning of overlays/minikube/kustomization.yaml were outdated. Those are what you followed, which led to the failed apply.

2) We switched from listing AWS credentials in provider.yaml to specifying them via a secret in some of the deploy paths (minikube, provision your own cluster, burst kip workloads from an on-prem cluster to AWS/GCP, etc.). Some of the scripts in overlays/minikube assumed the old format and some assumed the new format. Apologies for this issue. I fixed overlays/minikube to use AWS credentials in provider.yaml by default and not rely on a secret.
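As an illustration (values are placeholders), the default now expects the credentials directly in the cloud section of provider.yaml:

cloud:
  aws:
    region: us-east-1
    accessKeyID: "AKIA..."
    secretAccessKey: "..."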

Master now has fixes for both 1 and 2. https://github.com/elotl/kip/tree/master/deploy/manifests/kip/overlays/minikube/README.md has the latest instructions. I tested the latest instructions with the latest bits. Please let me know how they work for you. Thank you for your patience!

justnoise commented 4 years ago

Also, we've distributed our images across all AWS regions, so you can now run in eu-central-1.

myechuri commented 4 years ago

@joh4n : checking in to see whether you were able to make progress with your minikube environment. Please let us know if there are any further issues. Thanks!