ansible / awx-operator

An Ansible AWX operator for Kubernetes built with Operator SDK and Ansible. 🤖
https://www.github.com/ansible/awx
Apache License 2.0

"Permission denied: '/var/lib/awx/projects" when running two replicas #1176

Open joyartoun opened 1 year ago

joyartoun commented 1 year ago

Bug Summary

Hello all,

I am using this operator, deployed via Helm chart on OKD. When we run two or more replicas, the awx-web container logs the error "Permission denied: '/var/lib/awx/projects".

We use rook-ceph for storage. The PVC created for the projects folder uses the filesystem storage class with the ReadWriteMany access mode.

This is the CR manifest

apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  route_host: <redacted>
  create_preload_data: true
  route_tls_termination_mechanism: Edge
  garbage_collect_secrets: false
  ingress_type: route
  loadbalancer_port: 80
  image_pull_policy: IfNotPresent
  projects_storage_size: 20Gi
  projects_storage_access_mode: ReadWriteMany
  projects_persistence: true
  projects_storage_class: ceph-filesystem
  replicas: 1
  admin_user: admin
  loadbalancer_protocol: http
  nodeport_port: 30080
  task_privileged: true
  postgres_storage_requirements:
    requests:
      storage: 8Gi
    limits:
      storage: 50Gi
  postgres_storage_class: ceph-block

This is the pvc manifest

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  namespace: awx
  ownerReferences:
    - apiVersion: awx.ansible.com/v1beta1
      kind: AWX
      name: awx
      uid: a4993691-d509-41a8-a009-0202b9cc8374
  finalizers:
    - kubernetes.io/pvc-protection
  labels:
    app.kubernetes.io/component: awx
    app.kubernetes.io/managed-by: awx-operator
    app.kubernetes.io/name: awx
    app.kubernetes.io/operator-version: 1.1.3
    app.kubernetes.io/part-of: awx
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
  volumeName: pvc-a473d689-7bac-4841-8b21-4180a9a13e7d
  storageClassName: ceph-filesystem
  volumeMode: Filesystem
status:
  phase: Bound
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 20Gi

AWX Operator version

chart version 1.1.3

AWX version

AWX 21.10.2

Kubernetes platform

openshift

Kubernetes/Platform version

4.11

Modifications

no

Steps to reproduce

Install Helm chart version 1.1.3 on OpenShift with the following manifest. Be sure to use a storage class that supports ReadWriteMany.

apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  route_host: <redacted>
  create_preload_data: true
  route_tls_termination_mechanism: Edge
  garbage_collect_secrets: false
  ingress_type: route
  loadbalancer_port: 80
  image_pull_policy: IfNotPresent
  projects_storage_size: 20Gi
  projects_storage_access_mode: ReadWriteMany
  projects_persistence: true
  projects_storage_class: ceph-filesystem
  replicas: 2
  admin_user: admin
  loadbalancer_protocol: http
  nodeport_port: 30080
  task_privileged: true
  postgres_storage_requirements:
    requests:
      storage: 8Gi
    limits:
      storage: 50Gi
  postgres_storage_class: ceph-block

Expected results

Being able to run more than 1 replica of the awx pod.

Actual results

Only one pod at a time is able to access /var/lib/awx/projects, resulting in inconsistent behaviour of the application.
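
To compare what each replica actually sees on the shared volume, a small check like the following could be run inside every awx-web container. This is only a sketch: `describe_dir` is a hypothetical helper written for this report, not part of AWX or the operator, and it assumes a Python interpreter is available in the container. It prints the UID/GID the process runs as, the owner and mode of the directory, and whether a file can really be created there.

```python
import os
import stat
import tempfile


def describe_dir(path):
    """Hypothetical helper: report ownership, mode, and writability of a directory."""
    info = {
        "path": path,
        "process_uid": os.getuid(),  # UID the current process runs as
        "process_gid": os.getgid(),  # GID the current process runs as
    }
    try:
        st = os.stat(path)
    except OSError as exc:
        # The directory is missing or cannot be inspected at all.
        info["error"] = f"{type(exc).__name__}: {exc}"
        return info

    info["owner_uid"] = st.st_uid
    info["owner_gid"] = st.st_gid
    info["mode"] = stat.filemode(st.st_mode)  # e.g. 'drwxr-xr-x'

    try:
        # Creating (and automatically removing) a temp file is the real write test.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        info["writable"] = True
    except OSError as exc:
        info["writable"] = False
        info["error"] = f"{type(exc).__name__}: {exc}"
    return info


if __name__ == "__main__":
    # Path taken from the error message in this issue.
    for key, value in describe_dir("/var/lib/awx/projects").items():
        print(f"{key}: {value}")
```

If the pods report different process UIDs, or the directory owner matches only one of them, that would point to a permissions mismatch on the volume rather than to the ReadWriteMany access mode itself.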

Additional information

No response

Operator Logs

No response

djyasin commented 1 year ago

Hello @joyartoun, we need to do some additional investigation here.