vmware-archive / kubeless

Kubernetes Native Serverless Framework
https://kubeless.io
Apache License 2.0

Kubeless function overwrite containers resource in deployment template #1035

Closed — Heronalps closed this issue 4 years ago

Heronalps commented 5 years ago

Is this a BUG REPORT or FEATURE REQUEST?: BUG REPORT

What happened:

Kubeless overwrites the containers section of the deployment template defined in the kubeless config map.

What you expected to happen:

My initial idea was to make functions share the same volume. If the deployment template propagated to the function pod, that would be possible.

How to reproduce it (as minimally and precisely as possible):

Deployment JSON in the kubeless config map:

deployment: |-
     {
        "spec": {    
          "template": {      
            "spec": {
              "containers": [{
                "resources": {
                  "requests": {
                    "nvidia.com/gpu": "1"              
                  },              
                  "limits": {
                    "nvidia.com/gpu": "1"  
                  }
                },
                "volumeMounts": [{
                  "mountPath": "/racelab",
                  "name": "fs-store"
                }]        
              }],
              "nodeSelector": {
                "gpu-type": "1080Ti"
              },
              "volumes": [{
                "name": "fs-store",
                "flexVolume": {            
                  "driver": "ceph.rook.io/rook",
                  "fsType": "ceph",
                  "options": {
                    "clusterNamespace": "rook",
                    "fsName": "nautilusfs",
                    "path": "/racelab",
                    "mountUser": "racelab",
                    "mountSecret": "ceph-fs-secret"
                  }
                }        
              }]      
            }
          }  
        }
      }
Relevant output of kubectl describe pod for the function pod:

Volumes:
  fs-store:
    Type:       FlexVolume (a generic volume resource that is provisioned/attached using an exec based plugin)
    Driver:     ceph.rook.io/rook
    FSType:     ceph
    SecretRef:  nil
    ReadOnly:   false
    Options:    map[clusterNamespace:rook fsName:nautilusfs mountSecret:ceph-fs-secret mountUser:racelab path:/racelab]
Mounts:
      /kubeless from mnist-tf (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-f8rzs (ro)

In the pod, the volume is defined, but it is not mounted into the runtime container.

Anything else we need to know?:

Environment:

andresmgot commented 5 years ago

Hi @Heronalps,

You need to mount the volume in the runtime container, right? (The one that runs the function, not the one that installs the dependencies.) In theory, the controller should maintain whatever exists in the deployment template:

https://github.com/kubeless/kubeless/blob/master/pkg/utils/kubelessutil.go#L609

Can you paste here the full spec of the deployment generated for your function? (kubectl get deployment -o yaml <your-func>)

anaik-zam commented 4 years ago

Hi @andresmgot, I'm having the same issue: the volume is added to the pod but not mounted in the runtime container. This is the deployment template in the kubeless-config ConfigMap:

deployment: |-
    {   "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "volumeMounts": [
                            {
                                "mountPath": "/mnt/data",
                                "name": "kubeless-storage"
                            }
                        ]
                    }
                ],
                "volumes": [
                    {
                        "name": "kubeless-storage",
                        "persistentVolumeClaim": {
                            "claimName": "kubeless-storage"
                        }
                    }
                ]
            }
        }
      }
    }

Here is my function deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-10-29T14:05:55Z"
  generation: 1
  labels:
    created-by: kubeless
    function: anaik-test
  name: anaik-test
  namespace: ops-serverless
  ownerReferences:
  - apiVersion: kubeless.io/v1beta1
    kind: Function
    name: anaik-test
    uid: 39585c81-fa55-11e9-8b5a-12d299f34b42
  resourceVersion: "28111117"
  selfLink: /apis/extensions/v1beta1/namespaces/ops-serverless/deployments/anaik-test
  uid: 395f65a5-fa55-11e9-8b5a-12d299f34b42
spec:
  progressDeadlineSeconds: 2147483647
  replicas: 1
  revisionHistoryLimit: 2147483647
  selector:
    matchLabels:
      created-by: kubeless
      function: anaik-test
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "8080"
        prometheus.io/scrape: "true"
      creationTimestamp: null
      labels:
        created-by: kubeless
        function: anaik-test
    spec:
      containers:
      - env:
        - name: FUNC_HANDLER
          value: handle
        - name: MOD_NAME
          value: worker
        - name: FUNC_TIMEOUT
          value: "180"
        - name: FUNC_RUNTIME
          value: python3.7
        - name: FUNC_MEMORY_LIMIT
          value: "0"
        - name: FUNC_PORT
          value: "8080"
        - name: KUBELESS_INSTALL_VOLUME
          value: /kubeless
        - name: PYTHONPATH
          value: $(KUBELESS_INSTALL_VOLUME)/lib/python3.7/site-packages:$(KUBELESS_INSTALL_VOLUME)
        image: kubeless/python@sha256:dbf616cb06a262482c00f5b53e1de17571924032e0ad000865ec6b5357ff35bf
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 3
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 1
        name: anaik-test
        ports:
        - containerPort: 8080
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /kubeless
          name: anaik-test
      dnsPolicy: ClusterFirst
      initContainers:
      - args:
        - echo '7bde05ae320a405796d12654d8b6651f62c791b1d34e6d098015831a15ed0644  /src/worker.py'
          > /tmp/func.sha256 && sha256sum -c /tmp/func.sha256 && cp /src/worker.py
          /kubeless/worker.py && cp /src/requirements.txt /kubeless
        command:
        - sh
        - -c
        image: kubeless/unzip@sha256:f162c062973cca05459834de6ed14c039d45df8cdb76097f50b028a1621b3697
        imagePullPolicy: IfNotPresent
        name: prepare
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /kubeless
          name: anaik-test
        - mountPath: /src
          name: anaik-test-deps
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      terminationGracePeriodSeconds: 30
      volumes:
      - name: kubeless-storage
        persistentVolumeClaim:
          claimName: kubeless-storage
      - emptyDir: {}
        name: anaik-test
      - configMap:
          defaultMode: 420
          name: anaik-test
        name: anaik-test-deps
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2019-10-29T14:05:59Z"
    lastUpdateTime: "2019-10-29T14:05:59Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
andresmgot commented 4 years ago

Mm, that's indeed weird. It seems like a bug. Apparently the volumeMounts list from the deploymentSpec is empty instead of containing the one defined for the function:

https://github.com/kubeless/kubeless/blob/master/pkg/utils/kubelessutil.go#L618

I am not sure why that list is empty at that point.

anaik-zam commented 4 years ago

I think the ConfigMap deployment container specs are not getting merged here:

https://github.com/kubeless/kubeless/blob/master/pkg/controller/function_controller.go#L310
https://github.com/kubeless/kubeless/blob/master/pkg/utils/k8sutil.go#L376

I'm not familiar with Go, but if you can suggest how to fix the issue, I can try.

andresmgot commented 4 years ago

Could be; we would need to investigate at which point the container loses the volumeMount (or whether it has it at any point).

If you can help with that, what I would do is print the content of funcObj.Spec.Deployment.Spec.Template.Spec.Containers at different points until we figure out where it goes missing.
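
For example, a throwaway helper like the one below could be dropped into the controller package for that. This is only a sketch (it is not existing kubeless code), and the example call sites in the comment simply reuse the variable names and field paths from this thread:

    package controller

    import (
        "encoding/json"
        "log"

        corev1 "k8s.io/api/core/v1"
    )

    // dumpContainers pretty-prints a container list so the volumeMounts can be
    // compared at different points in the controller, e.g.
    //
    //     dumpContainers("funcObj before merge", funcObj.Spec.Deployment.Spec.Template.Spec.Containers)
    //     dumpContainers("configmap template", deployment.Spec.Template.Spec.Containers)
    func dumpContainers(label string, containers []corev1.Container) {
        out, err := json.MarshalIndent(containers, "", "  ")
        if err != nil {
            log.Printf("%s: could not marshal containers: %v", label, err)
            return
        }
        log.Printf("%s:\n%s", label, out)
    }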

If you modify some code, you can rebuild the image by executing make function-controller in the root directory of this project (you need a working Go environment). Then you can replace the image in your cluster with the one you have just built.

anaik-zam commented 4 years ago

I logged the destination and source objects before and after MergeDeployments. The volumes get merged, but the VolumeMounts in the containers don't:

Before utils.MergeDeployments::
funcObj.Spec.Deployment.Spec.Template.Spec.Containers[0].VolumeMounts:
[]
deployment.Spec.Template.Spec.Containers[0].VolumeMounts:
[{Name:kubeless-storage ReadOnly:false MountPath:/mnt/data SubPath: MountPropagation:<nil>}]

funcObj.Spec.Deployment.Spec.Template.Spec.Volumes:
[]
deployment.Spec.Template.Spec.Volumes:
[{Name:kubeless-storage VolumeSource:{HostPath:nil EmptyDir:&EmptyDirVolumeSource{Medium:,SizeLimit:<nil>,} GCEPersistentDisk:nil AWSElasticBlockStore:nil GitRepo:nil Secret:nil NFS:nil ISCSI:nil Glusterfs:nil PersistentVolumeClaim:nil RBD:nil FlexVolume:nil Cinder:nil CephFS:nil Flocker:nil DownwardAPI:nil FC:nil AzureFile:nil ConfigMap:nil VsphereVolume:nil Quobyte:nil AzureDisk:nil PhotonPersistentDisk:nil Projected:nil PortworxVolume:nil ScaleIO:nil StorageOS:nil}}]

After utils.MergeDeployments::
funcObj.Spec.Deployment.Spec.Template.Spec.Containers[0].VolumeMounts:
[]
deployment.Spec.Template.Spec.Containers[0].VolumeMounts:
[{Name:kubeless-storage ReadOnly:false MountPath:/mnt/data SubPath: MountPropagation:<nil>}]

funcObj.Spec.Deployment.Spec.Template.Spec.Volumes:
[{Name:kubeless-storage VolumeSource:{HostPath:nil EmptyDir:&EmptyDirVolumeSource{Medium:,SizeLimit:<nil>,} GCEPersistentDisk:nil AWSElasticBlockStore:nil GitRepo:nil Secret:nil NFS:nil ISCSI:nil Glusterfs:nil PersistentVolumeClaim:nil RBD:nil FlexVolume:nil Cinder:nil CephFS:nil Flocker:nil DownwardAPI:nil FC:nil AzureFile:nil ConfigMap:nil VsphereVolume:nil Quobyte:nil AzureDisk:nil PhotonPersistentDisk:nil Projected:nil PortworxVolume:nil ScaleIO:nil StorageOS:nil}}]
deployment.Spec.Template.Spec.Volumes:
[{Name:kubeless-storage VolumeSource:{HostPath:nil EmptyDir:&EmptyDirVolumeSource{Medium:,SizeLimit:<nil>,} GCEPersistentDisk:nil AWSElasticBlockStore:nil GitRepo:nil Secret:nil NFS:nil ISCSI:nil Glusterfs:nil PersistentVolumeClaim:nil RBD:nil FlexVolume:nil Cinder:nil CephFS:nil Flocker:nil DownwardAPI:nil FC:nil AzureFile:nil ConfigMap:nil VsphereVolume:nil Quobyte:nil AzureDisk:nil PhotonPersistentDisk:nil Projected:nil PortworxVolume:nil ScaleIO:nil StorageOS:nil}}]
andresmgot commented 4 years ago

Good catch! I have no clue why it's behaving that way, but at least we can work around it. Can you add a small piece of code that appends deployment.Spec.Template.Spec.Containers[0].VolumeMounts to funcObj.Spec.Deployment.Spec.Template.Spec.Containers[0].VolumeMounts? Something like:

https://github.com/kubeless/kubeless/blob/master/pkg/controller/function_controller.go#L333
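
Roughly along these lines (only a sketch: the helper name is made up, it operates on the corev1.PodSpec shared by both deployment API versions, and the real change would live next to the linked line):

    package controller

    import corev1 "k8s.io/api/core/v1"

    // appendTemplateVolumeMounts is a hypothetical helper (not existing kubeless
    // code): it copies the volumeMounts declared on the first container of the
    // config map's deployment template (src) into the function's runtime
    // container (dst), which is exactly what the merge is currently not doing.
    func appendTemplateVolumeMounts(dst, src *corev1.PodSpec) {
        if dst == nil || src == nil ||
            len(dst.Containers) == 0 || len(src.Containers) == 0 {
            return
        }
        dst.Containers[0].VolumeMounts = append(
            dst.Containers[0].VolumeMounts,
            src.Containers[0].VolumeMounts...,
        )
    }

Using the variable names from your logs, it would be called as appendTemplateVolumeMounts(&funcObj.Spec.Deployment.Spec.Template.Spec, &deployment.Spec.Template.Spec).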

Then you can check if that solves your issue (and send a PR to fix it :) )

anaik-zam commented 4 years ago

I will check if that works. Is there a test case that needs to be updated?

andresmgot commented 4 years ago

There is a TestMergeDeployments test that you can update to cover this (but it's not mandatory).
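
For what it's worth, a case along these lines would cover it. This is a sketch that exercises only the hypothetical appendTemplateVolumeMounts helper from the earlier comment, since the exact MergeDeployments signature isn't shown in this thread:

    package controller

    import (
        "testing"

        corev1 "k8s.io/api/core/v1"
    )

    // The config map template declares a mount that the function's runtime
    // container is missing; the helper should copy it over.
    func TestAppendTemplateVolumeMounts(t *testing.T) {
        dst := &corev1.PodSpec{Containers: []corev1.Container{{Name: "anaik-test"}}}
        src := &corev1.PodSpec{Containers: []corev1.Container{{
            VolumeMounts: []corev1.VolumeMount{{Name: "kubeless-storage", MountPath: "/mnt/data"}},
        }}}

        appendTemplateVolumeMounts(dst, src)

        mounts := dst.Containers[0].VolumeMounts
        if len(mounts) != 1 || mounts[0].Name != "kubeless-storage" || mounts[0].MountPath != "/mnt/data" {
            t.Fatalf("expected the template volumeMount to be copied, got %+v", mounts)
        }
    }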

anaik-zam commented 4 years ago

Here is the PR: https://github.com/kubeless/kubeless/pull/1093

andresmgot commented 4 years ago

The fix is included in the latest release; closing this issue.