splunk / splunk-operator

Splunk Operator for Kubernetes

App Framework: Deployment breaks if Istio is used #997

Open markusspitzli opened 1 year ago

markusspitzli commented 1 year ago

Please select the type of request

Bug

Tell us more

Describe the request App deployment is broken when Istio is used by the Splunk Operator and the Splunk Enterprise instances. With Istio disabled, app deployment works fine.

The Splunk Operator throws the following error:

2022-12-05T11:42:28.555443825Z ERROR afwSchedulerEntry unable to create directory on splunk pod {"controller": "standalone", "controllerGroup": "enterprise.splunk.com", "controllerKind": "Standalone", "Standalone": {"name":"mycompany-int","namespace":"mycompany-int"}, "namespace": "mycompany-int", "name": "mycompany-int", "reconcileID": "032c228b-b03f-4e5c-b3f0-44cf0a7e9b85", "name": "mycompany-int", "namespace": "mycompany-int", "error": "unable to create directory on Pod at path=/operator-staging/appframework/local. stdout: , stdErr: , err: a container name must be specified for pod splunk-mycompany-int-standalone-0, choose one of: [istio-init splunk istio-proxy]"}

Expected behavior App deployment takes Istio into account and delivers the app to the appropriate container.

Splunk setup on K8S The Splunk Operator (2.1.0) and a Standalone Splunk Enterprise (9.0.2) instance are installed in the same namespace.

Reproduction/Testing steps

  1. Install Istio.
  2. Create a namespace with Istio injection enabled: istio-injection: enabled.
  3. Deploy the Splunk Operator in this namespace.
  4. Deploy a Standalone instance in the same namespace.
  5. Try to deploy an app using the App Framework.
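Under the assumptions in the steps above, the namespace and a minimal Standalone resource would look roughly like this. This is an illustrative sketch, not the reporter's actual manifests: the apiVersion may differ by operator release, and the App Framework volume details (bucket, endpoint, app source path) are placeholders.

```yaml
# Namespace with Istio sidecar injection enabled (step 2)
apiVersion: v1
kind: Namespace
metadata:
  name: mycompany-int
  labels:
    istio-injection: enabled
---
# Minimal Standalone instance in the same namespace (step 4),
# with an App Framework app source so step 5 triggers the app copy.
apiVersion: enterprise.splunk.com/v4
kind: Standalone
metadata:
  name: mycompany-int
  namespace: mycompany-int
spec:
  appRepo:
    defaults:
      volumeName: volume_apps   # placeholder volume name
      scope: local
    appSources:
      - name: apps
        location: apps/        # placeholder remote path
    volumes:
      - name: volume_apps
        storageType: s3
        provider: aws
        path: bucket-name/                          # placeholder bucket
        endpoint: https://s3.us-east-1.amazonaws.com # placeholder endpoint
```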

K8s environment K8s Version 1.22.12

Proposed changes (optional) Treat pods as multi-container and explicitly specify the splunk container when deploying apps.
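The proposed change could be sketched as a small helper in the operator's Go code: given the pod's container names, always target the container named splunk instead of letting kubectl/exec guess. The function name and fallback behavior below are hypothetical, not the operator's actual implementation.

```go
package main

import "fmt"

// splunkContainerName is the fixed name the operator gives the Splunk
// container in every pod it creates (assumption based on this issue).
const splunkContainerName = "splunk"

// resolveExecContainer picks the container that exec/copy commands should
// target. With Istio sidecar injection the pod holds several containers
// (istio-init, splunk, istio-proxy), so the name must be explicit; without
// it, exec fails with "a container name must be specified for pod ...".
func resolveExecContainer(containers []string) (string, error) {
	for _, name := range containers {
		if name == splunkContainerName {
			return name, nil
		}
	}
	return "", fmt.Errorf("no container named %q in pod, found %v",
		splunkContainerName, containers)
}

func main() {
	// Container list from the pod described in this issue.
	name, err := resolveExecContainer([]string{"istio-init", "splunk", "istio-proxy"})
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```

Because the splunk container name is constant across operator-managed pods, this works both with sidecars injected and without them.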

sgontla commented 1 year ago

@markusspitzli , can you share the output of kubectl describe pod <podname>?

markusspitzli commented 1 year ago

@sgontla here you go

Name:         splunk-mycompany-int-standalone-0
Namespace:    mycompany-int
Priority:     0
Node:         32fb9f24-0118-49ab-bce7-96416aacbdaa/11.0.78.9
Start Time:   Mon, 05 Dec 2022 12:39:17 +0100
Labels:       app.kubernetes.io/component=standalone
              app.kubernetes.io/instance=splunk-mycompany-int-standalone
              app.kubernetes.io/managed-by=splunk-operator
              app.kubernetes.io/name=standalone
              app.kubernetes.io/part-of=splunk-mycompany-int-standalone
              controller-revision-hash=splunk-mycompany-int-standalone-69cf7f44d5
              security.istio.io/tlsMode=istio
              service.istio.io/canonical-name=standalone
              service.istio.io/canonical-revision=latest
              statefulset.kubernetes.io/pod-name=splunk-mycompany-int-standalone-0
Annotations:  argocd.argoproj.io/sync-wave: 3
              defaultConfigRev: 177958100
              kubectl.kubernetes.io/default-container: splunk
              kubectl.kubernetes.io/default-logs-container: splunk
              kubernetes.io/psp: pks-privileged
              prometheus.io/path: /stats/prometheus
              prometheus.io/port: 15020
              prometheus.io/scrape: true
              sidecar.istio.io/status:
                {"initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["workload-socket","credential-socket","workload-certs","istio-env...
              traffic.sidecar.istio.io/excludeOutboundPorts: 8089,8191,9997
              traffic.sidecar.istio.io/includeInboundPorts: 8000,8088
Status:       Running
IP:           11.32.109.3
IPs:
  IP:           11.32.109.3
Controlled By:  StatefulSet/splunk-mycompany-int-standalone
Init Containers:
  istio-init:
    Container ID:  docker://6c328d0a4a62b2ecdddff3201762be9659cffdf0b16f21c1ff41e90429a6d911
    Image:         remote-docker.artifactory.mycompany.com/istio/proxyv2:1.15.3
    Image ID:      docker-pullable://remote-docker.artifactory.mycompany.com/istio/proxyv2@sha256:de42717d56b022c5f469a892cdff28ae045476c59ad818ca2732bac51d076b19
    Port:          <none>
    Host Port:     <none>
    Args:
      istio-iptables
      -p
      15001
      -z
      15006
      -u
      1337
      -m
      REDIRECT
      -i
      *
      -x

      -b
      8000,8088
      -d
      15090,15021,15020
      -o
      8089,8191,9997
      --log_output_level=default:info
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 05 Dec 2022 12:39:34 +0100
      Finished:     Mon, 05 Dec 2022 12:39:35 +0100
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  1Gi
    Requests:
      cpu:        100m
      memory:     128Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-45h8z (ro)
Containers:
  splunk:
    Container ID:   docker://bab010ab299d5e76ecfc40658d4723476f5ba9a3321476fd1ac69f0095a46665
    Image:          remote-docker.artifactory.mycompany.com/splunk/splunk:9.0.2
    Image ID:       docker-pullable://remote-docker.artifactory.mycompany.com/splunk/splunk@sha256:0be31e80a19326e83ff31f3b8d0b71c72f577b80157cf76cab4ab8cd64e7daa0
    Ports:          8000/TCP, 8088/TCP, 8089/TCP, 9997/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP
    State:          Running
      Started:      Mon, 05 Dec 2022 12:39:36 +0100
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     8
      memory:  4Gi
    Requests:
      cpu:      100m
      memory:   4Gi
    Liveness:   exec [/mnt/probes/livenessProbe.sh] delay=30s timeout=30s period=30s #success=1 #failure=3
    Readiness:  exec [/bin/grep started /opt/container_artifact/splunk-container.state] delay=10s timeout=5s period=5s #success=1 #failure=3
    Startup:    exec [/mnt/probes/startupProbe.sh] delay=40s timeout=30s period=30s #success=1 #failure=12
    Environment:
      SPLUNK_DECLARATIVE_ADMIN_PASSWORD:             true
      SPLUNK_DEFAULTS_URL:                           /mnt/splunk-defaults/default.yml,/mnt/license-master/default.yaml,/mnt/splunk-secrets/default.yml
      SPLUNK_HOME:                                   /opt/splunk
      SPLUNK_HOME_OWNERSHIP_ENFORCEMENT:             false
      SPLUNK_OPERATOR_K8_LIVENESS_DRIVER_FILE_PATH:  /tmp/splunk_operator_k8s/probes/k8_liveness_driver.sh
      SPLUNK_ROLE:                                   splunk_standalone
      SPLUNK_START_ARGS:                             --accept-license
      TZ:                                            Europe/Zurich
    Mounts:
      /mnt/license-master from license-master (rw)
      /mnt/probes from splunk-mycompany-int-probe-configmap (rw)
      /mnt/splunk-defaults from mnt-splunk-defaults (rw)
      /mnt/splunk-secrets from mnt-splunk-secrets (rw)
      /operator-staging/ from operator-staging (rw)
      /opt/splunk/etc from pvc-etc (rw)
      /opt/splunk/var from pvc-var (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-45h8z (ro)
  istio-proxy:
    Container ID:  docker://cf1e6adbd46b7d01095da2169da5b3f9503fd48c03f53bebf8dc2fdd222f88ee
    Image:         remote-docker.artifactory.mycompany.com/istio/proxyv2:1.15.3
    Image ID:      docker-pullable://remote-docker.artifactory.mycompany.com/istio/proxyv2@sha256:de42717d56b022c5f469a892cdff28ae045476c59ad818ca2732bac51d076b19
    Port:          15090/TCP
    Host Port:     0/TCP
    Args:
      proxy
      sidecar
      --domain
      $(POD_NAMESPACE).svc.cluster.local
      --proxyLogLevel=warning
      --proxyComponentLogLevel=misc:error
      --log_output_level=default:info
      --concurrency
      2
    State:          Running
      Started:      Mon, 05 Dec 2022 12:39:36 +0100
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  1Gi
    Requests:
      cpu:      100m
      memory:   128Mi
    Readiness:  http-get http://:15021/healthz/ready delay=1s timeout=3s period=2s #success=1 #failure=30
    Environment:
      JWT_POLICY:                    third-party-jwt
      PILOT_CERT_PROVIDER:           istiod
      CA_ADDR:                       istiod.istio-system.svc:15012
      POD_NAME:                      splunk-mycompany-int-standalone-0 (v1:metadata.name)
      POD_NAMESPACE:                 mycompany-int (v1:metadata.namespace)
      INSTANCE_IP:                    (v1:status.podIP)
      SERVICE_ACCOUNT:                (v1:spec.serviceAccountName)
      HOST_IP:                        (v1:status.hostIP)
      PROXY_CONFIG:                  {}

      ISTIO_META_POD_PORTS:          [
                                         {"name":"http-splunkweb","containerPort":8000,"protocol":"TCP"}
                                         ,{"name":"http-hec","containerPort":8088,"protocol":"TCP"}
                                         ,{"name":"https-splunkd","containerPort":8089,"protocol":"TCP"}
                                         ,{"name":"tcp-s2s","containerPort":9997,"protocol":"TCP"}
                                     ]
      ISTIO_META_APP_CONTAINERS:     splunk
      ISTIO_META_CLUSTER_ID:         Kubernetes
      ISTIO_META_INTERCEPTION_MODE:  REDIRECT
      ISTIO_META_WORKLOAD_NAME:      splunk-mycompany-int-standalone
      ISTIO_META_OWNER:              kubernetes://apis/apps/v1/namespaces/mycompany-int/statefulsets/splunk-mycompany-int-standalone
      ISTIO_META_MESH_ID:            cluster.local
      TRUST_DOMAIN:                  cluster.local
    Mounts:
      /etc/istio/pod from istio-podinfo (rw)
      /etc/istio/proxy from istio-envoy (rw)
      /var/lib/istio/data from istio-data (rw)
      /var/run/secrets/credential-uds from credential-socket (rw)
      /var/run/secrets/istio from istiod-ca-cert (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-45h8z (ro)
      /var/run/secrets/tokens from istio-token (rw)
      /var/run/secrets/workload-spiffe-credentials from workload-certs (rw)
      /var/run/secrets/workload-spiffe-uds from workload-socket (rw)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  workload-socket:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  credential-socket:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  workload-certs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  istio-envoy:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  istio-data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  istio-podinfo:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels -> labels
      metadata.annotations -> annotations
  istio-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  43200
  istiod-ca-cert:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      istio-ca-root-cert
    Optional:  false
  pvc-etc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvc-etc-splunk-mycompany-int-standalone-0
    ReadOnly:   false
  pvc-var:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvc-var-splunk-mycompany-int-standalone-0
    ReadOnly:   false
  license-master:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  splunk-license-master
    Optional:    false
  mnt-splunk-defaults:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      splunk-mycompany-int-standalone-defaults
    Optional:  false
  mnt-splunk-secrets:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  splunk-mycompany-int-standalone-secret-v3
    Optional:    false
  operator-staging:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  splunk-mycompany-int-probe-configmap:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      splunk-mycompany-int-probe-configmap
    Optional:  false
  kube-api-access-45h8z:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>
sgontla commented 1 year ago

@markusspitzli , Thanks for sharing the spec. Looking at the Pod spec, it looks like Istio is injecting additional containers into the pod. Are you using Istio for a service-mesh topology? If so, this is not something we have tested or support at this time. The Splunk Operator has its own calculated spec for the pod, and if Istio injects additional containers, that may conflict with the Operator's spec state. I may be wrong, but my guess is that if you inspect the Splunk Operator logs, you will notice the Operator reconciling forever due to the pod spec conflict.

As of now, Istio can be used with a Splunk Operator deployment only for ingress and egress support, not as a service-mesh topology.

markusspitzli commented 1 year ago

@sgontla That's correct. Istio uses sidecars that run alongside the splunk container. We need this for SSL encryption between the pods, so that we can guarantee encryption for data in motion; it's a necessary feature demanded by our customers. Another handy feature for debugging and troubleshooting is the visualization of traffic between pods with Kiali, which relies on Istio. Supporting sidecars in the operator is therefore important for us. I don't think it's much of an effort to integrate this into the App Framework: Splunk containers are always named splunk, so just add the parameter to select the splunk container. This will work both when sidecars are used and when none are in place. I assume there may be other cases besides Istio where sidecars are used in conjunction with the Splunk container.

sgontla commented 1 year ago

@markusspitzli , Thanks for the feedback. We are on the same page about making this work with Istio. Just curious: have you tried setting up Search Head clustering, Indexer clustering, etc., with Istio enabled? AFAIK, the App Framework is not the only affected feature; other areas of the code may hit a similar issue, since the Operator assumes only one container.

vivekr-splunk commented 1 year ago

This issue is fixed in the 2.2.1 release. Please use 2.2.1.

philipsabri commented 9 months ago

> this issue is fixed in 2.2.1 release. please use 2.2.1

@vivekr-splunk I'm using 2.2.1 and I'm trying to add a sidecar, but it's being overwritten. I'm not using Istio; I just tried adding it manually to the StatefulSet. Any ideas how I can proceed?