apache / solr-operator

Official Kubernetes operator for Apache Solr
https://solr.apache.org/operator
Apache License 2.0

/var/solr/data with incorrect ownership : Pod not able to write #662

Closed brickpattern closed 7 months ago

brickpattern commented 7 months ago

Setup: AWS EKS : latest Solr Operator : 11 Solrcloud : 9.3

Spec in the Solr values.yaml:


dataStorage:
  type: "ephemeral"
  capacity: "1000Gi"
  ephemeral: {}
    # emptyDir: {}
    # hostPath: {}

The request is for 1000Gi, but looking from inside the pod, /var/solr/data lands on the root filesystem, which is only 20G:

lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
nvme1n1       259:0    0 558.8G  0 disk 
nvme0n1       259:1    0    20G  0 disk 
├─nvme0n1p1   259:3    0    20G  0 part /var/solr/data
│                                       /var/solr
│                                       /etc/resolv.conf
│                                       /etc/hostname
│                                       /dev/termination-log
│                                       /etc/hosts
└─nvme0n1p128 259:4    0     1M  0 part 

I could combine the NVMe devices into a single XFS filesystem.

The documentation says to use hostPath. Just checking whether any other directive can be set from within the SolrCloud values.yaml. What approach should I take to get /var/solr/data mounted on that drive? Any thoughts...
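
If the larger NVMe device is already formatted and mounted on the node, one way to do this from the values.yaml is to point the ephemeral storage at a hostPath instead of the default emptyDir. A minimal sketch, assuming the drive is mounted at /mnt/solr-data (that path and the Directory type are assumptions; the chart does not format or mount the device for you):

dataStorage:
  type: "ephemeral"
  ephemeral:
    hostPath:
      # Assumed mount point for the 558.8G nvme1n1 device; the node (or a
      # bootstrap/userdata script) must format and mount it beforehand.
      path: /mnt/solr-data
      type: Directory

Note that hostPath ties the data to the individual node and Kubernetes does not manage the ownership or capacity of that directory.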

HoustonPutman commented 7 months ago

Can you show kubectl describe pod ....? The default capacity using the helm chart is 20Gi, so it's suspicious that you are seeing that number.

brickpattern commented 7 months ago

sure... output below


Name:             solr-solrcloud-0
Namespace:        solr
Priority:         0
Service Account:  solr-operator
Node:            ***masked***
Start Time:       Thu, 30 Nov 2023 14:07:55 -0600
Labels:           app.kubernetes.io/instance=solr
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=solr
                  app.kubernetes.io/version=8.11.1
                  controller-revision-hash=solr-solrcloud-6c988c6c9d
                  helm.sh/chart=solr-0.7.1
                  solr-cloud=solr
                  statefulset.kubernetes.io/pod-name=solr-solrcloud-0
                  technology=solr-cloud
Annotations:      solr.apache.org/solrXmlMd5: 5fe99d590bc63efc3caa743ca939aa5a
Status:           Running
IP:               x.x.x.x
IPs:
  IP:           x.x.x.x
Controlled By:  StatefulSet/solr-solrcloud
Init Containers:
  cp-solr-xml:
    Container ID:  containerd://e091ce2e416e7b7f4f1687fab3f3bce10581780bdbc598dec1caf1784c1241b2
    Image:         library/busybox:1.28.0-glibc
    Image ID:      docker.io/library/busybox@sha256:0b55a30394294ab23b9afd58fab94e61a923f5834fba7ddbae7f8e0c11ba85e6
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      cp /tmp/solr.xml /tmp-config/solr.xml
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 30 Nov 2023 14:07:57 -0600
      Finished:     Thu, 30 Nov 2023 14:07:57 -0600
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     50e-3
      memory:  50M
    Requests:
      cpu:        50e-3
      memory:     50M
    Environment:  <none>
    Mounts:
      /tmp from solr-xml (rw)
      /tmp-config from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-m8g4x (ro)
Containers:
  solrcloud-node:
    Container ID:   containerd://e6147d82deb08b6a2152e35475bfc547a12ab594e79121978b9c588d1092db62
    Image:          ***masked***
    Image ID:       ***masked***
    Port:           8983/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Thu, 30 Nov 2023 14:08:22 -0600
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:8983/solr/admin/info/system delay=0s timeout=1s period=20s #success=1 #failure=3
    Readiness:      http-get http://:8983/solr/admin/info/health delay=0s timeout=1s period=10s #success=1 #failure=2
    Startup:        http-get http://:8983/solr/admin/info/system delay=10s timeout=1s period=5s #success=1 #failure=10
    Environment:
      SOLR_JAVA_MEM:        -Xms8192m -Xmx16384m
      SOLR_HOME:            /var/solr/data
      SOLR_PORT:            8983
      SOLR_NODE_PORT:       80
      SOLR_PORT_ADVERTISE:  80
      POD_HOSTNAME:         solr-solrcloud-0 (v1:metadata.name)
      POD_NAME:             solr-solrcloud-0 (v1:metadata.name)
      POD_IP:                (v1:status.podIP)
      POD_NAMESPACE:        solr (v1:metadata.namespace)
      SOLR_HOST:            $(POD_NAME).solr
      SOLR_LOG_LEVEL:       INFO
      GC_TUNE:              
      SOLR_STOP_WAIT:       55
      ZK_HOST:              solr-solrcloud-zookeeper-0.solr-solrcloud-zookeeper-headless.solr.svc.cluster.local:2181,solr-solrcloud-zookeeper-1.solr-solrcloud-zookeeper-headless.solr.svc.cluster.local:2181,solr-solrcloud-zookeeper-2.solr-solrcloud-zookeeper-headless.solr.svc.cluster.local:2181/
      ZK_CHROOT:            /
      ZK_SERVER:            solr-solrcloud-zookeeper-0.solr-solrcloud-zookeeper-headless.solr.svc.cluster.local:2181,solr-solrcloud-zookeeper-1.solr-solrcloud-zookeeper-headless.solr.svc.cluster.local:2181,solr-solrcloud-zookeeper-2.solr-solrcloud-zookeeper-headless.solr.svc.cluster.local:2181
      SOLR_OPTS:            -DhostPort=$(SOLR_NODE_PORT) -Denable.runtime.lib=true -Denable.packages=true
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-m8g4x (ro)
      /var/solr/data from data (rw)
Readiness Gates:
  Type                                 Status
  solr.apache.org/isNotStopped         True 
  solr.apache.org/replicasNotEvicted   True 
Conditions:
  Type                                 Status
  solr.apache.org/isNotStopped         True 
  solr.apache.org/replicasNotEvicted   True 
  Initialized                          True 
  Ready                                True 
  ContainersReady                      True 
  PodScheduled                         True 
Volumes:
  solr-xml:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      solr-solrcloud-configmap
    Optional:  false
  data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  1000Gi
  kube-api-access-m8g4x:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           ***masked***.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                             role=solr-cluster:NoSchedule

brickpattern commented 7 months ago

I moved from ephemeral to a persistent volume to resolve this; the only other option exposed in the spec/values.yaml is hostPath.
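
A persistent dataStorage spec along these lines is roughly what that move amounts to (the storageClassName value is only an example; use whatever StorageClass backs the larger disks in the cluster):

dataStorage:
  type: "persistent"
  capacity: "1000Gi"
  persistent:
    reclaimPolicy: "Retain"
    pvc:
      # "gp3" is just an example EBS StorageClass on EKS.
      storageClassName: "gp3"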

However, moving to a persistent volume created a filesystem permission issue. In the pod with the mounted volume, /var/solr is owned by solr:solr, whereas /var/solr/data is owned by root, so the pod is not able to write to the data directory.

$ cd /var/
$ ls -ltar
total 0
drwxrwsr-x 2 root staff    6 Apr 18  2022 local
drwxr-xr-x 2 root root     6 Apr 18  2022 backups
drwxr-xr-x 2 root root    18 Oct  4 02:08 spool
lrwxrwxrwx 1 root root     4 Oct  4 02:08 run -> /run
drwxr-xr-x 2 root root     6 Oct  4 02:08 opt
drwxrwsr-x 2 root mail     6 Oct  4 02:08 mail
lrwxrwxrwx 1 root root     9 Oct  4 02:08 lock -> /run/lock
drwxrwxrwt 2 root root     6 Oct  4 02:12 tmp
drwxr-xr-x 1 root root    29 Oct 30 23:27 lib
drwxr-xr-x 1 root root    57 Oct 30 23:27 log
drwxr-xr-x 1 root root    48 Oct 30 23:27 cache
drwxr-xr-x 1 root root    41 Oct 31 03:15 .
drwxr-xr-x 1 root root    28 Dec  1 03:21 ..
drwxrwx--- 4 solr solr    48 Dec  1 03:21 solr
$ cd solr
$ ls -ltar
total 8
-rw-r--r-- 1 solr solr 3931 Oct 11 02:11 log4j2.xml
drwxr-xr-x 1 root root   41 Oct 31 03:15 ..
drwxr-xr-x 3 root root 4096 Dec  1 03:21 data
drwxrwx--- 4 solr solr   48 Dec  1 03:21 .
drwxrwx--- 2 solr solr  101 Dec  1 03:22 logs

brickpattern commented 7 months ago

Is this a bug, or is there a way to override the ownership of /var/solr/data?
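
One way to work around the ownership mismatch, assuming the official Solr image (which runs as user solr, UID/GID 8983), is to set a pod security context through customSolrKubeOptions so that Kubernetes applies an fsGroup to the mounted volume. This is only a sketch, not something the chart sets by default, and fsGroup re-owns PVC-backed volumes but not hostPath mounts:

customSolrKubeOptions:
  podOptions:
    podSecurityContext:
      runAsUser: 8983    # "solr" user in the official image
      runAsGroup: 8983
      fsGroup: 8983      # mounted volumes get this group and group-write access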

brickpattern commented 6 months ago

I reinstalled the chart.

Closing the ticket.