datashim-io / datashim

A kubernetes based framework for hassle free handling of datasets
http://datashim-io.github.io/datashim
Apache License 2.0

Transport endpoint is not connected when Using pvc as docker volume #303

Open morteza1131 opened 1 year ago

morteza1131 commented 1 year ago

I used the created PVC as the Docker volume at /var/lib/docker and got the following error:

INFO[2023-08-20T12:06:35.408468700Z] [core] [Channel #1] Channel switches to new LB policy "pick_first"  module=grpc
INFO[2023-08-20T12:06:35.408662504Z] [core] [Channel #1 SubChannel #2] Subchannel created  module=grpc
INFO[2023-08-20T12:06:35.408781477Z] [core] [Channel #1 SubChannel #2] Subchannel Connectivity change to CONNECTING  module=grpc
INFO[2023-08-20T12:06:35.408860655Z] [core] [Channel #1 SubChannel #2] Subchannel picks a new address "/var/run/docker/containerd/containerd.sock" to connect  module=grpc
INFO[2023-08-20T12:06:35.409224219Z] [core] [Channel #1] Channel Connectivity change to CONNECTING  module=grpc
INFO[2023-08-20T12:06:35.621741194Z] starting containerd                           revision=1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f version=v1.6.19
INFO[2023-08-20T12:06:35.644954961Z] loading plugin "io.containerd.content.v1.content"...  type=io.containerd.content.v1
INFO[2023-08-20T12:06:35.686415897Z] loading plugin "io.containerd.snapshotter.v1.aufs"...  type=io.containerd.snapshotter.v1
INFO[2023-08-20T12:06:35.748865266Z] loading plugin "io.containerd.snapshotter.v1.devmapper"...  type=io.containerd.snapshotter.v1
WARN[2023-08-20T12:06:35.749017062Z] failed to load plugin io.containerd.snapshotter.v1.devmapper  error="devmapper not configured"
INFO[2023-08-20T12:06:35.749075748Z] loading plugin "io.containerd.snapshotter.v1.native"...  type=io.containerd.snapshotter.v1
INFO[2023-08-20T12:06:35.811462959Z] loading plugin "io.containerd.snapshotter.v1.overlayfs"...  type=io.containerd.snapshotter.v1
INFO[2023-08-20T12:06:35.869175776Z] loading plugin "io.containerd.snapshotter.v1.zfs"...  type=io.containerd.snapshotter.v1
INFO[2023-08-20T12:06:35.870005609Z] skip loading plugin "io.containerd.snapshotter.v1.zfs"...  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
INFO[2023-08-20T12:06:35.870073808Z] loading plugin "io.containerd.metadata.v1.bolt"...  type=io.containerd.metadata.v1
WARN[2023-08-20T12:06:35.885743206Z] could not use snapshotter devmapper in metadata plugin  error="devmapper not configured"
INFO[2023-08-20T12:06:35.885844312Z] metadata content store policy set             policy=shared
WARN[2023-08-20T12:06:35.908747115Z] failed to load plugin io.containerd.metadata.v1.bolt  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.908830606Z] loading plugin "io.containerd.differ.v1.walking"...  type=io.containerd.differ.v1
WARN[2023-08-20T12:06:35.908864872Z] failed to load plugin io.containerd.differ.v1.walking  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.908901710Z] loading plugin "io.containerd.event.v1.exchange"...  type=io.containerd.event.v1
INFO[2023-08-20T12:06:35.908962186Z] loading plugin "io.containerd.gc.v1.scheduler"...  type=io.containerd.gc.v1
WARN[2023-08-20T12:06:35.909000433Z] failed to load plugin io.containerd.gc.v1.scheduler  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.909032888Z] loading plugin "io.containerd.service.v1.introspection-service"...  type=io.containerd.service.v1
INFO[2023-08-20T12:06:35.909154319Z] loading plugin "io.containerd.service.v1.containers-service"...  type=io.containerd.service.v1
WARN[2023-08-20T12:06:35.909189608Z] failed to load plugin io.containerd.service.v1.containers-service  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.909222349Z] loading plugin "io.containerd.service.v1.content-service"...  type=io.containerd.service.v1
WARN[2023-08-20T12:06:35.909252347Z] failed to load plugin io.containerd.service.v1.content-service  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.909283231Z] loading plugin "io.containerd.service.v1.diff-service"...  type=io.containerd.service.v1
WARN[2023-08-20T12:06:35.909321772Z] failed to load plugin io.containerd.service.v1.diff-service  error="could not load required differ due plugin init error: walking: write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.909354747Z] loading plugin "io.containerd.service.v1.images-service"...  type=io.containerd.service.v1
WARN[2023-08-20T12:06:35.909385786Z] failed to load plugin io.containerd.service.v1.images-service  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.909415486Z] loading plugin "io.containerd.service.v1.leases-service"...  type=io.containerd.service.v1
WARN[2023-08-20T12:06:35.909456342Z] failed to load plugin io.containerd.service.v1.leases-service  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.909513879Z] loading plugin "io.containerd.service.v1.namespaces-service"...  type=io.containerd.service.v1
WARN[2023-08-20T12:06:35.909540996Z] failed to load plugin io.containerd.service.v1.namespaces-service  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.909569363Z] loading plugin "io.containerd.service.v1.snapshots-service"...  type=io.containerd.service.v1
WARN[2023-08-20T12:06:35.909595688Z] failed to load plugin io.containerd.service.v1.snapshots-service  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.909625823Z] loading plugin "io.containerd.runtime.v1.linux"...  type=io.containerd.runtime.v1
WARN[2023-08-20T12:06:35.916373565Z] failed to load plugin io.containerd.runtime.v1.linux  error="mkdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux: transport endpoint is not connected"
INFO[2023-08-20T12:06:35.916455672Z] loading plugin "io.containerd.runtime.v2.task"...  type=io.containerd.runtime.v2
WARN[2023-08-20T12:06:35.916527584Z] failed to load plugin io.containerd.runtime.v2.task  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.916549427Z] loading plugin "io.containerd.monitor.v1.cgroups"...  type=io.containerd.monitor.v1
INFO[2023-08-20T12:06:35.917368843Z] loading plugin "io.containerd.service.v1.tasks-service"...  type=io.containerd.service.v1
WARN[2023-08-20T12:06:35.917418238Z] could not load runtime instance due to initialization error  error="mkdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux: transport endpoint is not connected"
WARN[2023-08-20T12:06:35.917467374Z] failed to load plugin io.containerd.service.v1.tasks-service  error="no runtimes available to create task service"
INFO[2023-08-20T12:06:35.917492944Z] loading plugin "io.containerd.grpc.v1.introspection"...  type=io.containerd.grpc.v1
INFO[2023-08-20T12:06:35.918021655Z] loading plugin "io.containerd.internal.v1.restart"...  type=io.containerd.internal.v1
WARN[2023-08-20T12:06:35.918083920Z] failed to load plugin io.containerd.internal.v1.restart  error="failed to get instance of service \"diff-service\": could not load required differ due plugin init error: walking: write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.918122344Z] loading plugin "io.containerd.grpc.v1.containers"...  type=io.containerd.grpc.v1
WARN[2023-08-20T12:06:35.918138364Z] failed to load plugin io.containerd.grpc.v1.containers  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.918159600Z] loading plugin "io.containerd.grpc.v1.content"...  type=io.containerd.grpc.v1
WARN[2023-08-20T12:06:35.918179307Z] failed to load plugin io.containerd.grpc.v1.content  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.918203687Z] loading plugin "io.containerd.grpc.v1.diff"...  type=io.containerd.grpc.v1
WARN[2023-08-20T12:06:35.918224084Z] failed to load plugin io.containerd.grpc.v1.diff  error="could not load required differ due plugin init error: walking: write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.918246623Z] loading plugin "io.containerd.grpc.v1.events"...  type=io.containerd.grpc.v1
INFO[2023-08-20T12:06:35.918270109Z] loading plugin "io.containerd.grpc.v1.healthcheck"...  type=io.containerd.grpc.v1
INFO[2023-08-20T12:06:35.918313598Z] loading plugin "io.containerd.grpc.v1.images"...  type=io.containerd.grpc.v1
WARN[2023-08-20T12:06:35.918340094Z] failed to load plugin io.containerd.grpc.v1.images  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.918367022Z] loading plugin "io.containerd.grpc.v1.leases"...  type=io.containerd.grpc.v1
WARN[2023-08-20T12:06:35.918387991Z] failed to load plugin io.containerd.grpc.v1.leases  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.918411946Z] loading plugin "io.containerd.grpc.v1.namespaces"...  type=io.containerd.grpc.v1
WARN[2023-08-20T12:06:35.918435562Z] failed to load plugin io.containerd.grpc.v1.namespaces  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.918456848Z] loading plugin "io.containerd.internal.v1.opt"...  type=io.containerd.internal.v1
INFO[2023-08-20T12:06:35.919034124Z] loading plugin "io.containerd.grpc.v1.snapshots"...  type=io.containerd.grpc.v1
WARN[2023-08-20T12:06:35.919075540Z] failed to load plugin io.containerd.grpc.v1.snapshots  error="write /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: operation not supported"
INFO[2023-08-20T12:06:35.919118144Z] loading plugin "io.containerd.grpc.v1.tasks"...  type=io.containerd.grpc.v1
WARN[2023-08-20T12:06:35.919150980Z] failed to load plugin io.containerd.grpc.v1.tasks  error="no runtimes available to create task service"
INFO[2023-08-20T12:06:35.919170679Z] loading plugin "io.containerd.grpc.v1.version"...  type=io.containerd.grpc.v1
INFO[2023-08-20T12:06:35.919194335Z] loading plugin "io.containerd.tracing.processor.v1.otlp"...  type=io.containerd.tracing.processor.v1
INFO[2023-08-20T12:06:35.919221567Z] skip loading plugin "io.containerd.tracing.processor.v1.otlp"...  error="no OpenTelemetry endpoint: skip plugin" type=io.containerd.tracing.processor.v1
INFO[2023-08-20T12:06:35.919244159Z] loading plugin "io.containerd.internal.v1.tracing"...  type=io.containerd.internal.v1
ERRO[2023-08-20T12:06:35.919345841Z] failed to initialize a tracing processor "otlp"  error="no OpenTelemetry endpoint: skip plugin"
INFO[2023-08-20T12:06:35.919969056Z] serving...                                    address=/var/run/docker/containerd/containerd-debug.sock
INFO[2023-08-20T12:06:35.920098863Z] serving...                                    address=/var/run/docker/containerd/containerd.sock.ttrpc
INFO[2023-08-20T12:06:35.920216177Z] serving...                                    address=/var/run/docker/containerd/containerd.sock
INFO[2023-08-20T12:06:35.920289755Z] containerd successfully booted in 0.403657s  
INFO[2023-08-20T12:06:35.941299609Z] [core] [Channel #1 SubChannel #2] Subchannel Connectivity change to READY  module=grpc
INFO[2023-08-20T12:06:35.941441779Z] [core] [Channel #1] Channel Connectivity change to READY  module=grpc
WARN[2023-08-20T12:06:35.946491923Z] failed to rename /var/lib/docker/tmp for background deletion: rename /var/lib/docker/tmp /var/lib/docker/tmp-old: transport endpoint is not connected. Deleting synchronously 
WARN[2023-08-20T12:06:35.946562469Z] failed to delete old tmp directory: /var/lib/docker/tmp 
INFO[2023-08-20T12:06:35.946771259Z] stopping healthcheck following graceful shutdown  module=libcontainerd
INFO[2023-08-20T12:06:35.946831860Z] [core] [Channel #1] Channel Connectivity change to SHUTDOWN  module=grpc
INFO[2023-08-20T12:06:35.946866957Z] [core] [Channel #1 SubChannel #2] Subchannel Connectivity change to SHUTDOWN  module=grpc
INFO[2023-08-20T12:06:35.946889470Z] [core] [Channel #1 SubChannel #2] Subchannel deleted  module=grpc
INFO[2023-08-20T12:06:35.946924755Z] [core] [Channel #1] Channel deleted           module=grpc
failed to start daemon: Unable to get the TempDir under /var/lib/docker: mkdir /var/lib/docker/tmp: transport endpoint is not connected

srikumar003 commented 1 year ago

Hi, I haven't tried this mode of using Datashim, but the error "failed to start daemon: Unable to get the TempDir under /var/lib/docker: mkdir /var/lib/docker/tmp: transport endpoint is not connected" seems to indicate a network issue. Please check that you have specified the endpoint URL correctly.
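
On Linux, "transport endpoint is not connected" is the message for errno ENOTCONN, which a FUSE-backed mount can also return when its userspace process has exited. Assuming the csi-s3 volume is mounted that way (not confirmed in this thread), a minimal sketch like the following, run inside the pod, may help tell a dead mount apart from a wrong endpoint URL. The function name `check_mount` is hypothetical.

```python
import errno
import os


def check_mount(path):
    """Classify a mount path as 'ok', 'disconnected', or an errno name."""
    try:
        os.stat(path)      # fails with ENOTCONN if the mount has gone away
        os.listdir(path)   # also exercises a directory read
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            return "disconnected"
        # Any other failure is reported by its symbolic errno name.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


if __name__ == "__main__":
    # /var/lib/docker is the mount path from the Deployment in this issue.
    print(check_mount("/var/lib/docker"))
```

A result of "disconnected" would point at the mount process rather than at the S3 endpoint configuration.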

morteza1131 commented 1 year ago

I used the PV in a simple Nginx pod and it works fine, but when I use it as the Docker root volume, the daemon does not start.

srikumar003 commented 1 year ago

Could you post here how you configured Datashim and the pod? Also, could you explain why you are doing this? So far we have not tried using the PVC in this manner.

You could also try overriding the mountPath in the CSI-S3 manifests: https://github.com/datashim-io/datashim/blob/8ec79a3aa25334c287dd2c877a2e3765c441f6ca/src/csi-s3/chart/templates/csi-s3.yaml#L140C30-L140C45 and see if that works (re: #160)

morteza1131 commented 1 year ago

I want to use an S3 volume as my pod volume, so that the pod uses S3 storage instead of local disk in my Kubernetes environment. I use this:

apiVersion: com.ie.ibm.hpsys/v1alpha1
kind: Dataset
metadata:
  name: docker-cache-s3-dataset
spec:
  local:
    type: "COS"
    accessKeyID: "adf"
    secretAccessKey: "adfadsfasdfa"
    endpoint: "http://somes3.com"
    bucket: "docker-cache"
    readonly: "false"
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: dind-daemon-test
  namespace: adminstuff
data:
  config.json: |
    {
      "auths": {
        "docker.example.com": {
          "auth": "sfdadfa"
        },
        "https://index.docker.io/v1/": {
          "auth": "adfadsfa="
        },
        "pvreg.example.com": {
          "auth": "adfadfa="
        }
      },
      "proxies": {
        "default": {
          "httpProxy": "http://examples.com",
          "httpsProxy": "http://examples.com"
        }
      }
    }
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: dockerbuilder-test
  namespace: adminstuff
  labels:
    app: dockerbuilder
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dockerbuilder
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: dockerbuilder
    spec:
      volumes:
        - name: "docker-dir"
          persistentVolumeClaim:
            claimName: "docker-cache-s3-dataset"
        - name: dind-daemon
          configMap:
            name: dind-daemon
            defaultMode: 420
      containers:
        - name: dind
          image: 'docker:23.0.2-dind'
          command:
            - dockerd-entrypoint.sh
            - '--insecure-registry=pvreg.example.com'
            - '--registry-mirror=https://pvreg.example.com'
          ports:
            - containerPort: 2375
              protocol: TCP
          env:
            - name: http_proxy
              value: https://pvreg.example.com
            - name: https_proxy
              value: https://pvreg.example.com
            - name: DOCKER_OPTS
              value: >-
                -H tcp://0.0.0.0:2375
            - name: DOCKER_TLS_CERTDIR
          volumeMounts:
            - name: "docker-dir"
              mountPath: /var/lib/docker
            - name: dind-daemon
              mountPath: /root/.docker/config.json 
              subPath: config.json
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true

srikumar003 commented 1 year ago

@morteza1131 I would really not recommend changing your Docker volumes to use the volumes provided by csi-s3. S3 volumes are not POSIX compliant, so there could be serious problems when the Docker daemon provisions the filesystem for the containers.
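
To see which operations the mount actually rejects, a rough probe like the one below could be run against the mounted directory. It is only a sketch: the function name `probe_directory` and the choice of operations (directory rename, file locking, memory-mapped writes, in-place overwrite) are assumptions about what a container runtime's on-disk stores typically need, not a list taken from Docker or containerd documentation.

```python
import fcntl
import mmap
import os
import shutil


def probe_directory(root):
    """Try a few POSIX operations under root; map each name to None or an error."""
    results = {}
    work = os.path.join(root, "posix-probe")
    data = os.path.join(work, "data.bin")

    def step(name, func):
        try:
            func()
            results[name] = None
        except OSError as exc:
            results[name] = exc.strerror or str(exc)

    def create():
        with open(data, "wb") as f:
            f.write(b"\0" * mmap.PAGESIZE)
            f.flush()
            os.fsync(f.fileno())

    def lock():
        with open(data, "rb+") as f:
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl.flock(f, fcntl.LOCK_UN)

    def mmap_write():
        with open(data, "rb+") as f:
            with mmap.mmap(f.fileno(), mmap.PAGESIZE) as m:
                m[0:4] = b"test"
                m.flush()

    def overwrite():
        # Random-access write in the middle of an existing file.
        with open(data, "rb+") as f:
            f.seek(8)
            f.write(b"x")

    def rename_dir():
        os.rename(work, work + "-old")
        os.rename(work + "-old", work)

    step("mkdir", lambda: os.makedirs(work, exist_ok=True))
    step("create+fsync", create)
    step("flock", lock)
    step("mmap-write", mmap_write)
    step("overwrite", overwrite)
    step("rename-dir", rename_dir)
    # Best-effort cleanup of anything the probe left behind.
    shutil.rmtree(work, ignore_errors=True)
    shutil.rmtree(work + "-old", ignore_errors=True)
    return results


if __name__ == "__main__":
    for name, err in probe_directory("/var/lib/docker").items():
        print(f"{name}: {'ok' if err is None else err}")
```

Any step that reports an error such as "Operation not supported" marks an operation that would need a different backing volume.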

morteza1131 commented 1 year ago

I don't want to use it as a Docker container volume. I only need it as the Docker daemon's root volume, to store the container image build cache.