Root cause: longhorn-driver-deployer is not able to communicate with longhorn-manager when checking the MountPropagation option. The log error printed by the pod is misleading.
This connectivity error can be seen by checking the linkerd-proxy sidecar logs:
oss@node1:~$ kubectl logs longhorn-manager-x7h2t linkerd-proxy -n longhorn-system
[ 0.002680s] INFO ThreadId(01) linkerd2_proxy::rt: Using single-threaded proxy runtime
[ 0.005036s] INFO ThreadId(01) linkerd2_proxy: Admin interface on 0.0.0.0:4191
[ 0.005103s] INFO ThreadId(01) linkerd2_proxy: Inbound interface on 0.0.0.0:4143
[ 0.005116s] INFO ThreadId(01) linkerd2_proxy: Outbound interface on 127.0.0.1:4140
[ 0.005125s] INFO ThreadId(01) linkerd2_proxy: Tap DISABLED
[ 0.005135s] INFO ThreadId(01) linkerd2_proxy: Local identity is longhorn-service-account.longhorn-system.serviceaccount.identity.linkerd.cluster.local
[ 0.005147s] INFO ThreadId(01) linkerd2_proxy: Identity verified via linkerd-identity-headless.linkerd.svc.cluster.local:8080 (linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.005158s] INFO ThreadId(01) linkerd2_proxy: Destinations resolved via linkerd-dst-headless.linkerd.svc.cluster.local:8086 (linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local)
[ 0.068813s] INFO ThreadId(02) daemon:identity: linkerd_app: Certified identity: longhorn-service-account.longhorn-system.serviceaccount.identity.linkerd.cluster.local
[ 74.392084s] INFO ThreadId(01) inbound:server{port=9500}:rescue{client.addr=10.42.3.5:37924}: linkerd_app_core::errors::respond: Request failed error=error trying to connect: Connection refused (os error 111)
[ 76.427399s] INFO ThreadId(01) inbound:server{port=9500}:rescue{client.addr=10.42.3.5:38066}: linkerd_app_core::errors::respond: Request failed error=error trying to connect: Connection refused (os error 111)
[ 78.465339s] INFO ThreadId(01) inbound:server{port=9500}:rescue{client.addr=10.42.3.5:38086}: linkerd_app_core::errors::respond: Request failed error=error trying to connect: Connection refused (os error 111)
[ 80.505249s] INFO ThreadId(01) inbound:server{port=9500}:rescue{client.addr=10.42.3.5:38094}: linkerd_app_core::errors::respond: Request failed error=error trying to connect: Connection refused (os error 111)
The last log lines show that inbound connections to longhorn-manager (TCP port 9500) are being refused at OS level. Linkerd is trying to route the traffic coming from longhorn-driver-deployer, but the connection is refused by the longhorn-manager process.
The problem is that, when linkerd-proxy forwards the incoming traffic to the application container, the connection is made over localhost; since the longhorn-manager process is listening only on the pod IP address and not on localhost, the connection is refused.
The following command shows the longhorn-manager container listening on port 9500 only on the pod IP (10.42.3.6):
oss@node1:~$ kubectl exec -it longhorn-manager-x7h2t -c longhorn-manager -n longhorn-system -- /bin/bash
root@longhorn-manager-x7h2t:/# ss -tunlp
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp LISTEN 0 4096 127.0.0.1:6060 0.0.0.0:* users:(("longhorn-manage",pid=1,fd=39))
tcp LISTEN 0 128 127.0.0.1:4140 0.0.0.0:*
tcp LISTEN 0 128 0.0.0.0:4143 0.0.0.0:*
tcp LISTEN 0 4096 10.42.3.6:9500 0.0.0.0:* users:(("longhorn-manage",pid=1,fd=41))
tcp LISTEN 0 128 0.0.0.0:4191 0.0.0.0:*
Linkerd's iptables forwarding rules make all traffic received by meshed containers appear to come from localhost. There is an open linkerd issue (https://github.com/linkerd/linkerd2/issues/4713) about changing linkerd's default behavior to keep the original IP addresses when forwarding traffic, using TPROXY.
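For reference, the iptables rules applied by the injected linkerd-init container can be inspected from its logs, since proxy-init logs the rules it configures (pod name taken from this environment):
kubectl logs longhorn-manager-x7h2t -c linkerd-init -n longhorn-system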
There is an open longhorn issue (https://github.com/longhorn/longhorn/issues/1315) describing a similar problem when trying to mesh with Istio. As a workaround, it is proposed to change the longhorn-manager POD_IP environment variable.
The longhorn-manager container selects the IP to listen on from the POD_IP environment variable, which by default points to the IP assigned to the pod.
Extracted from the daemonset definition:
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
1) Use the kubectl set env --list command to list the current environment variables:
oss@node1:~$ kubectl set env daemonset/longhorn-manager -n longhorn-system --list
# DaemonSet longhorn-manager, container longhorn-manager
# POD_NAMESPACE from field path metadata.namespace
# POD_IP from field path status.podIP
# NODE_NAME from field path spec.nodeName
DEFAULT_SETTING_PATH=/var/lib/longhorn-setting/default-setting.yaml
2) Use the kubectl set env VAR=value command to set the new value of the environment variable. Change the daemonset's POD_IP environment variable:
kubectl set env daemonset/longhorn-manager -n longhorn-system POD_IP=0.0.0.0
After the change, the daemonset is redeployed. Check the daemonset environment variables again:
oss@node1:~$ kubectl set env daemonset/longhorn-manager -n longhorn-system --list
# DaemonSet longhorn-manager, container longhorn-manager
# POD_NAMESPACE from field path metadata.namespace
POD_IP=0.0.0.0
# NODE_NAME from field path spec.nodeName
DEFAULT_SETTING_PATH=/var/lib/longhorn-setting/default-setting.yaml
Helm provides the possibility to manipulate, configure, and/or validate rendered manifests before they are installed, through the --post-renderer option. This enables the use of kustomize to apply configuration changes without the need to fork a public chart or to require chart maintainers to expose every last configuration option of a piece of software.
Since v1.14, kubectl includes kustomize support:
kubectl kustomize <kustomization_directory>
kubectl apply -k <kustomization_directory>
Based on the procedure described in this post, kustomize can be used to apply patches to the manifest files generated by Helm before installing them.
Step 1: Create a kustomize directory
mkdir kustomize
Step 2: Create a kustomize wrapper script within the kustomize directory
#!/bin/bash
# save incoming YAML to file
cat <&0 > all.yaml
# modify the YAML with kustomize
kubectl kustomize . && rm all.yaml
The script simply saves the incoming manifests from Helm to a temporary file all.yaml, then executes kubectl kustomize against the current directory, and finally removes the temporary file.
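Note that Helm requires the post-renderer to be executable, so the wrapper script needs execute permissions (assuming it is saved as kustomize inside the kustomize directory, matching the --post-renderer ./kustomize path used later):
# make the wrapper script executable
chmod +x kustomize/kustomize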
Step 3: Create the kustomize files
kustomization.yml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- all.yaml
patches:
- path: patch.yaml
target:
kind: DaemonSet
name: "longhorn-manager"
This file tells kustomize to patch the DaemonSet longhorn-manager found in all.yaml using patch.yaml.
patch.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: longhorn-manager
spec:
template:
spec:
containers:
- name: longhorn-manager
env:
- name: POD_IP
value: 0.0.0.0
valueFrom:
NOTE: The key valueFrom must be set to null in order to delete its previous value (the fieldRef to status.podIP); otherwise the patched env entry would end up with both value and valueFrom set, which is not valid.
Step 4: Render and install the chart using the --post-renderer option pointing to the kustomize wrapper script:
helm install longhorn longhorn/longhorn -f ../longhorn_values.yml --post-renderer ./kustomize --debug --dry-run
helm install longhorn longhorn/longhorn -f ../longhorn_values.yml --post-renderer ./kustomize --namespace longhorn-system
NOTE: Ansible's helm module does not yet support the --post-renderer option. There is an open issue in the kubernetes.core collection to provide this functionality (https://github.com/ansible-collections/kubernetes.core/issues/30).
Longhorn still does not completely start when patching the POD_IP environment variable using the previously described procedure (helm + post-rendering with kustomize) and adding the corresponding helm chart parameter (annotations in values.yml), which automatically adds the linkerd annotation to the longhorn-manager daemonset manifest.
defaultSettings:
defaultDataPath: "/storage"
annotations:
linkerd.io/inject: enabled
longhorn-driver-deployer keeps failing with the same original messages: "Got an error when checking MountPropagation with node status, Node XX is not support mount propagation" and "Error deploying driver: CSI cannot be deployed because MountPropagation is not set: Node XX is not support mount propagation".
Now the linkerd-proxy sidecar containers do not show any OS-level connection errors, since longhorn-manager is listening on all addresses, including localhost. longhorn-manager does not show any relevant log message.
Longhorn provides an environment_check.sh script that performs the mount propagation validation.
Executing the script in my environment shows that mount propagation is enabled in the cluster:
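For reference, a way to fetch and run the check script (the raw GitHub path and release tag are an assumption; adjust to the installed Longhorn version):
curl -sSfL https://raw.githubusercontent.com/longhorn/longhorn/v1.2.4/scripts/environment_check.sh | bash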
daemonset.apps/longhorn-environment-check created
waiting for pods to become ready (0/3)
all pods ready (3/3)
MountPropagation is enabled!
cleaning up...
daemonset.apps "longhorn-environment-check" deleted
clean up complete
Patching the script to add the linkerd implicit annotation to the test daemonset, so that linkerd-proxy is automatically injected, makes the script fail the validation.
Modified check script (only change: adding the linkerd annotation to the daemonset manifest in the create_ds function):
#!/bin/bash
dependencies() {
local targets=($@)
local allFound=true
for ((i=0; i<${#targets[@]}; i++)); do
local target=${targets[$i]}
if [ "$(which $target)" == "" ]; then
allFound=false
echo Not found: $target
fi
done
if [ "$allFound" == "false" ]; then
echo "Please install missing dependencies."
exit 2
fi
}
create_ds() {
cat <<EOF > $TEMP_DIR/environment_check.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app: longhorn-environment-check
name: longhorn-environment-check
spec:
selector:
matchLabels:
app: longhorn-environment-check
template:
metadata:
labels:
app: longhorn-environment-check
annotations:
linkerd.io/inject: enabled
spec:
containers:
- name: longhorn-environment-check
image: busybox
args: ["/bin/sh", "-c", "sleep 1000000000"]
volumeMounts:
- name: mountpoint
mountPath: /tmp/longhorn-environment-check
mountPropagation: Bidirectional
securityContext:
privileged: true
volumes:
- name: mountpoint
hostPath:
path: /tmp/longhorn-environment-check
EOF
kubectl create -f $TEMP_DIR/environment_check.yaml
}
cleanup() {
echo "cleaning up..."
kubectl delete -f $TEMP_DIR/environment_check.yaml
rm -rf $TEMP_DIR
echo "clean up complete"
}
wait_ds_ready() {
while true; do
local ds=$(kubectl get ds/longhorn-environment-check -o json)
local numberReady=$(echo $ds | jq .status.numberReady)
local desiredNumberScheduled=$(echo $ds | jq .status.desiredNumberScheduled)
if [ "$desiredNumberScheduled" == "$numberReady" ] && [ "$desiredNumberScheduled" != "0" ]; then
echo "all pods ready ($numberReady/$desiredNumberScheduled)"
return
fi
echo "waiting for pods to become ready ($numberReady/$desiredNumberScheduled)"
sleep 3
done
}
validate_ds() {
local allSupported=true
local pods=$(kubectl -l app=longhorn-environment-check get po -o json)
local ds=$(kubectl get ds/longhorn-environment-check -o json)
local desiredNumberScheduled=$(echo $ds | jq .status.desiredNumberScheduled)
for ((i=0; i<desiredNumberScheduled; i++)); do
local pod=$(echo $pods | jq .items[$i])
local nodeName=$(echo $pod | jq -r .spec.nodeName)
local mountPropagation=$(echo $pod | jq -r '.spec.containers[0].volumeMounts[] | select(.name=="mountpoint") | .mountPropagation')
if [ "$mountPropagation" != "Bidirectional" ]; then
allSupported=false
echo "node $nodeName: MountPropagation DISABLED"
fi
done
if [ "$allSupported" != "true" ]; then
echo
echo " MountPropagation is disabled on at least one node."
echo " As a result, CSI driver and Base image cannot be supported."
echo
exit 1
else
echo -e "\n MountPropagation is enabled!\n"
fi
}
dependencies kubectl jq mktemp
TEMP_DIR=$(mktemp -d)
trap cleanup EXIT
create_ds
wait_ds_ready
validate_ds
exit 0
Modified script output:
daemonset.apps/longhorn-environment-check created
waiting for pods to become ready (0/3)
waiting for pods to become ready (0/3)
waiting for pods to become ready (1/3)
all pods ready (3/3)
node node4: MountPropagation DISABLED
node node2: MountPropagation DISABLED
node node3: MountPropagation DISABLED
MountPropagation is disabled on at least one node.
As a result, CSI driver and Base image cannot be supported.
cleaning up...
daemonset.apps "longhorn-environment-check" deleted
clean up complete
Looking at the validation logic in the script (validate_ds function), we can see that the validation consists in checking whether the volume marked with mountPropagation=Bidirectional in the daemonset appears in the deployed pod, by looking at spec.containers[0].volumeMounts.
The error is that the validation assumes the testing pod contains one single container (spec.containers[0]), but when the linkerd-proxy sidecar container is automatically injected, the first container in the pod spec is linkerd-proxy and not the testing container, so the validation logic fails.
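A minimal sketch of how the jq query in validate_ds could select the test container by name instead of by index (container name taken from the daemonset above; this is only an illustration, not the upstream fix):
# inside validate_ds(): pick the container by name instead of using .spec.containers[0]
local mountPropagation=$(echo $pod | jq -r '.spec.containers[] | select(.name=="longhorn-environment-check") | .volumeMounts[] | select(.name=="mountpoint") | .mountPropagation')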
The same logic is in the longhorn-manager code (controller/node_controller.go):
func (nc *NodeController) syncNodeStatus(pod *v1.Pod, node *longhorn.Node) error {
// sync bidirectional mount propagation for node status to check whether the node could deploy CSI driver
for _, mount := range pod.Spec.Containers[0].VolumeMounts {
if mount.Name == types.LonghornSystemKey {
mountPropagationStr := ""
if mount.MountPropagation == nil {
mountPropagationStr = "nil"
} else {
mountPropagationStr = string(*mount.MountPropagation)
}
if mount.MountPropagation == nil || *mount.MountPropagation != v1.MountPropagationBidirectional {
node.Status.Conditions = types.SetCondition(node.Status.Conditions, longhorn.NodeConditionTypeMountPropagation, longhorn.ConditionStatusFalse,
string(longhorn.NodeConditionReasonNoMountPropagationSupport),
fmt.Sprintf("The MountPropagation value %s is not detected from pod %s, node %s", mountPropagationStr, pod.Name, pod.Spec.NodeName))
} else {
node.Status.Conditions = types.SetCondition(node.Status.Conditions, longhorn.NodeConditionTypeMountPropagation, longhorn.ConditionStatusTrue, "", "")
}
break
}
}
return nil
}
This logic will always fail, since the injected longhorn-manager pod has the sidecar container (linkerd-proxy) as the first container in its spec, rather than the container that declares the checked volume mount (the longhorn-manager container).
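This can be verified by querying the container order of the injected pod (pod name taken from this environment):
kubectl get pod longhorn-manager-595f4 -n longhorn-system -o jsonpath='{.spec.containers[*].name}'
# prints: linkerd-proxy longhorn-manager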
This is the injected longhorn-manager pod definition:
apiVersion: v1
kind: Pod
metadata:
annotations:
linkerd.io/created-by: linkerd/proxy-injector stable-2.11.1
linkerd.io/identity-mode: default
linkerd.io/inject: enabled
linkerd.io/proxy-version: ""
creationTimestamp: "2022-03-31T16:11:53Z"
generateName: longhorn-manager-
labels:
app: longhorn-manager
app.kubernetes.io/instance: longhorn
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: longhorn
app.kubernetes.io/version: v1.2.4
controller-revision-hash: 6bf5cb878f
helm.sh/chart: longhorn-1.2.4
linkerd.io/control-plane-ns: linkerd
linkerd.io/proxy-daemonset: longhorn-manager
linkerd.io/workload-ns: longhorn-system
pod-template-generation: "1"
name: longhorn-manager-595f4
namespace: longhorn-system
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: DaemonSet
name: longhorn-manager
uid: 69bb999c-7e49-4f81-822a-f52134df0195
resourceVersion: "2091"
uid: f2bdeb4d-e79c-4742-82bb-85fc7b787089
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- node4
containers:
- env:
- name: _pod_name
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: _pod_ns
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: _pod_nodeName
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
- name: LINKERD2_PROXY_LOG
value: warn,linkerd=info
- name: LINKERD2_PROXY_LOG_FORMAT
value: plain
- name: LINKERD2_PROXY_DESTINATION_SVC_ADDR
value: linkerd-dst-headless.linkerd.svc.cluster.local.:8086
- name: LINKERD2_PROXY_DESTINATION_PROFILE_NETWORKS
value: 10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16
- name: LINKERD2_PROXY_POLICY_SVC_ADDR
value: linkerd-policy.linkerd.svc.cluster.local.:8090
- name: LINKERD2_PROXY_POLICY_WORKLOAD
value: $(_pod_ns):$(_pod_name)
- name: LINKERD2_PROXY_INBOUND_DEFAULT_POLICY
value: all-unauthenticated
- name: LINKERD2_PROXY_POLICY_CLUSTER_NETWORKS
value: 10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16
- name: LINKERD2_PROXY_INBOUND_CONNECT_TIMEOUT
value: 100ms
- name: LINKERD2_PROXY_OUTBOUND_CONNECT_TIMEOUT
value: 1000ms
- name: LINKERD2_PROXY_CONTROL_LISTEN_ADDR
value: 0.0.0.0:4190
- name: LINKERD2_PROXY_ADMIN_LISTEN_ADDR
value: 0.0.0.0:4191
- name: LINKERD2_PROXY_OUTBOUND_LISTEN_ADDR
value: 127.0.0.1:4140
- name: LINKERD2_PROXY_INBOUND_LISTEN_ADDR
value: 0.0.0.0:4143
- name: LINKERD2_PROXY_INBOUND_IPS
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIPs
- name: LINKERD2_PROXY_INBOUND_PORTS
value: "9500"
- name: LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES
value: svc.cluster.local.
- name: LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE
value: 10000ms
- name: LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE
value: 10000ms
- name: LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION
value: 25,587,3306,4444,5432,6379,9300,11211
- name: LINKERD2_PROXY_DESTINATION_CONTEXT
value: |
{"ns":"$(_pod_ns)", "nodeName":"$(_pod_nodeName)"}
- name: _pod_sa
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.serviceAccountName
- name: _l5d_ns
value: linkerd
- name: _l5d_trustdomain
value: cluster.local
- name: LINKERD2_PROXY_IDENTITY_DIR
value: /var/run/linkerd/identity/end-entity
- name: LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS
value: |
-----BEGIN CERTIFICATE-----
MIIBbjCCARSgAwIBAgIQeNw6P3sTzeEP6HWrqBVdeDAKBggqhkjOPQQDAjAXMRUw
EwYDVQQDEwxwaWNsdXN0ZXItY2EwHhcNMjIwMzMxMTYwNjI5WhcNMjIwNjI5MTYw
NjI5WjAXMRUwEwYDVQQDEwxwaWNsdXN0ZXItY2EwWTATBgcqhkjOPQIBBggqhkjO
PQMBBwNCAAQ7BVZ9adHHTd0ls1xh2BZ7lu00mCu+nUlNCog3yJ6e5D9uV+ze/X6n
xksXxnU6MPSjlB/TPVQPjKqZDjcufWKYo0IwQDAOBgNVHQ8BAf8EBAMCAqQwDwYD
VR0TAQH/BAUwAwEB/zAdBgNVHQ4EFgQUagqJ7nsMA0GIjPz/Ye7HlOluYiYwCgYI
KoZIzj0EAwIDSAAwRQIgDsq0d6g8cxhnS6Q0xTY9KfWZSY9DMibRoEHHFjsRGSYC
IQDV1ARoQeIeYWfshIS7J0Hhf3YVwLhIFNXl813astvtcw==
-----END CERTIFICATE-----
- name: LINKERD2_PROXY_IDENTITY_TOKEN_FILE
value: /var/run/secrets/kubernetes.io/serviceaccount/token
- name: LINKERD2_PROXY_IDENTITY_SVC_ADDR
value: linkerd-identity-headless.linkerd.svc.cluster.local.:8080
- name: LINKERD2_PROXY_IDENTITY_LOCAL_NAME
value: $(_pod_sa).$(_pod_ns).serviceaccount.identity.linkerd.cluster.local
- name: LINKERD2_PROXY_IDENTITY_SVC_NAME
value: linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local
- name: LINKERD2_PROXY_DESTINATION_SVC_NAME
value: linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
- name: LINKERD2_PROXY_POLICY_SVC_NAME
value: linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
image: cr.l5d.io/linkerd/proxy:stable-2.11.1
imagePullPolicy: IfNotPresent
lifecycle:
postStart:
exec:
command:
- /usr/lib/linkerd/linkerd-await
livenessProbe:
failureThreshold: 3
httpGet:
path: /live
port: 4191
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: linkerd-proxy
ports:
- containerPort: 4143
name: linkerd-proxy
protocol: TCP
- containerPort: 4191
name: linkerd-admin
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /ready
port: 4191
scheme: HTTP
initialDelaySeconds: 2
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources: {}
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsUser: 2102
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /var/run/linkerd/identity/end-entity
name: linkerd-identity-end-entity
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-lptk4
readOnly: true
- command:
- longhorn-manager
- -d
- daemon
- --engine-image
- longhornio/longhorn-engine:v1.2.4
- --instance-manager-image
- longhornio/longhorn-instance-manager:v1_20220303
- --share-manager-image
- longhornio/longhorn-share-manager:v1_20211020
- --backing-image-manager-image
- longhornio/backing-image-manager:v2_20210820
- --manager-image
- longhornio/longhorn-manager:v1.2.4
- --service-account
- longhorn-service-account
env:
- name: POD_IP
value: 0.0.0.0
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
- name: DEFAULT_SETTING_PATH
value: /var/lib/longhorn-setting/default-setting.yaml
image: longhornio/longhorn-manager:v1.2.4
imagePullPolicy: IfNotPresent
name: longhorn-manager
ports:
- containerPort: 9500
name: manager
protocol: TCP
readinessProbe:
failureThreshold: 3
periodSeconds: 10
successThreshold: 1
tcpSocket:
port: 9500
timeoutSeconds: 1
resources: {}
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /host/dev/
name: dev
- mountPath: /host/proc/
name: proc
- mountPath: /var/lib/longhorn/
mountPropagation: Bidirectional
name: longhorn
- mountPath: /var/lib/longhorn-setting/
name: longhorn-default-setting
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-lptk4
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
initContainers:
- args:
- --incoming-proxy-port
- "4143"
- --outgoing-proxy-port
- "4140"
- --proxy-uid
- "2102"
- --inbound-ports-to-ignore
- 4190,4191,4567,4568
- --outbound-ports-to-ignore
- 4567,4568
image: cr.l5d.io/linkerd/proxy-init:v1.4.0
imagePullPolicy: IfNotPresent
name: linkerd-init
resources:
limits:
cpu: 100m
memory: 50Mi
requests:
cpu: 10m
memory: 10Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_ADMIN
- NET_RAW
privileged: false
readOnlyRootFilesystem: true
runAsNonRoot: false
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /run
name: linkerd-proxy-init-xtables-lock
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-lptk4
readOnly: true
nodeName: node4
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: longhorn-service-account
serviceAccountName: longhorn-service-account
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/disk-pressure
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/pid-pressure
operator: Exists
- effect: NoSchedule
key: node.kubernetes.io/unschedulable
operator: Exists
volumes:
- hostPath:
path: /dev/
type: ""
name: dev
- hostPath:
path: /proc/
type: ""
name: proc
- hostPath:
path: /var/lib/longhorn/
type: ""
name: longhorn
- configMap:
defaultMode: 420
name: longhorn-default-setting
name: longhorn-default-setting
- name: kube-api-access-lptk4
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
- emptyDir: {}
name: linkerd-proxy-init-xtables-lock
- emptyDir:
medium: Memory
name: linkerd-identity-end-entity
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2022-03-31T16:11:57Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2022-03-31T16:12:41Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2022-03-31T16:12:41Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2022-03-31T16:11:54Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: containerd://8128e6cc190e65dc21f5f26ce79a579622189a7a2221ea9b2a64a43b259f80e1
image: cr.l5d.io/linkerd/proxy:stable-2.11.1
imageID: cr.l5d.io/linkerd/proxy@sha256:91b53d4b39e4c058e5fc63b72dd7ab6fe7f7051869ec5251dc9c0d8287b2771f
lastState: {}
name: linkerd-proxy
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2022-03-31T16:11:57Z"
- containerID: containerd://22f1359639497b20d929978c5b88fb84509dd150744549bae30280a08ec88e40
image: docker.io/longhornio/longhorn-manager:v1.2.4
imageID: docker.io/longhornio/longhorn-manager@sha256:f418c79b3cb91ed1dcd28565938401360948afe17c694d65c51d825b0549f21b
lastState: {}
name: longhorn-manager
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2022-03-31T16:12:40Z"
hostIP: 10.0.0.14
initContainerStatuses:
- containerID: containerd://b64263c27b07b3fa262e1acdcfe21a60ea4ef9d734ad68ca6a496522478c6c27
image: cr.l5d.io/linkerd/proxy-init:v1.4.0
imageID: cr.l5d.io/linkerd/proxy-init@sha256:60d12fbb0b4a53962a5c2a59b496b3ee20052d26c0c56fd2ee38fd7fae62146e
lastState: {}
name: linkerd-init
ready: true
restartCount: 0
state:
terminated:
containerID: containerd://b64263c27b07b3fa262e1acdcfe21a60ea4ef9d734ad68ca6a496522478c6c27
exitCode: 0
finishedAt: "2022-03-31T16:11:57Z"
reason: Completed
startedAt: "2022-03-31T16:11:56Z"
phase: Running
podIP: 10.42.2.5
podIPs:
- ip: 10.42.2.5
qosClass: Burstable
startTime: "2022-03-31T16:11:54Z"
Opened a bug in the longhorn repository: https://github.com/longhorn/longhorn/issues/3809
Closed by mistake.
As part of PR #48, longhorn-manager and longhorn-ui have been meshed with linkerd after Longhorn has been successfully deployed:
1) The longhorn-manager POD_IP environment variable is changed.
2) The longhorn-manager daemonset and the longhorn-ui deployment are patched to include the linkerd annotation (see the sketch below).
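A rough kubectl equivalent of those two changes (the PR applies them via Ansible; resource names are taken from this thread):
# 1) make longhorn-manager listen on all addresses
kubectl set env daemonset/longhorn-manager -n longhorn-system POD_IP=0.0.0.0
# 2) add the linkerd injection annotation to the pod templates
kubectl patch daemonset longhorn-manager -n longhorn-system -p '{"spec":{"template":{"metadata":{"annotations":{"linkerd.io/inject":"enabled"}}}}}'
kubectl patch deployment longhorn-ui -n longhorn-system -p '{"spec":{"template":{"metadata":{"annotations":{"linkerd.io/inject":"enabled"}}}}}'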
Keeping this issue open until the reported longhorn bug is solved.
The PR fixing the raised longhorn issue has been merged into the master branch: https://github.com/longhorn/longhorn-manager/pull/2389
This PR is included in the 1.6.0 release. See release notes: https://github.com/longhorn/longhorn/releases/tag/v1.6.0
Linkerd is going to be deprecated as the service mesh solution in the cluster. See #320
No need to fix this issue.
Issue Description
When installing Longhorn in a namespace annotated to automatically inject linkerd-proxy (implicit annotation), the Longhorn installation hangs and does not deploy the CSI plugin components.
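For reference, the implicit (namespace-level) injection annotation referred to here looks like this (namespace name from this deployment):
kubectl annotate namespace longhorn-system linkerd.io/inject=enabled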
The longhorn-driver-deployer pod keeps restarting and crashing every time with the same errors: "Got an error when checking MountPropagation with node status, Node XX is not support mount propagation" and "Error deploying driver: CSI cannot be deployed because MountPropagation is not set: Node XX is not support mount propagation".
longhorn-driver-deployer logs:
longhorn-manager logs do not show any informative message.