Closed: cyron closed this issue 7 months ago
Hello,
Not sure if it's the same root cause, but I also had an issue deploying because MongoDB never got ready and the pod kept getting restarted. For me, the issue came from the readiness and liveness probes never succeeding. It was related to https://github.com/bitnami/charts/issues/10264. There seems to be an issue with older versions of mongosh.
In the current release of Litmus, MongoDB uses the image bitnami/mongodb:5.0.8-debian-10-r24, which includes mongosh 1.4.2. According to the last post in https://www.mongodb.com/community/forums/t/mongosh-eval-freezes-the-shell/121406/12, the issue happens for all mongosh versions below 1.6.1.
That thread also describes three solutions. I used the last one, which is to upgrade to a newer version of mongosh. I currently use the tag 5.0.23-debian-11-r7.
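To make the "below 1.6.1" threshold concrete, here is a small sketch (the `affected` helper is illustrative only; the cutoff comes from the forum thread above, and the comparison uses `sort -V`) that checks whether a given mongosh version falls in the affected range:

```shell
#!/bin/sh
# affected VERSION -> succeeds if VERSION is older than 1.6.1,
# i.e. subject to the mongosh --eval freeze discussed above.
affected() {
  # sort -V orders versions numerically; if VERSION sorts before 1.6.1
  # (and is not exactly 1.6.1), it is older than the fixed release.
  [ "$1" != "1.6.1" ] &&
    [ "$(printf '%s\n1.6.1\n' "$1" | sort -V | head -n 1)" = "$1" ]
}

affected "1.4.2" && echo "mongosh 1.4.2: affected"       # version bundled in 5.0.8-debian-10-r24
affected "1.6.1" || echo "mongosh 1.6.1: not affected"
```

You can feed it the output of `mongosh --version` from inside the pod to check your own image.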
I tried the proposed image with no luck.
My env: k8s (rke2) 1.26.8, chart version 3.1.0, behind an enterprise proxy. Here are the values used:
# Default values for litmus.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
portalScope: cluster

nameOverride: ""

# -- Additional labels
customLabels: {}
# my.company.com/concourse-cd: 2

# -- Use existing secret (e.g., External Secrets)
existingSecret: ""

adminConfig:
  JWTSecret: "litmus-portal@123"
  VERSION: "3.1.0"
  SKIP_SSL_VERIFY: "false"
  # -- leave empty if using the MongoDB deployed by this chart
  DBPASSWORD: ""
  DBUSER: ""
  DB_SERVER: "mongodb://litmus-prod01-mongodb-headless"
  DB_PORT: ""
  ADMIN_USERNAME: "admin"
  ADMIN_PASSWORD: "litmus"

image:
  imageRegistryName: litmuschaos.docker.scarf.sh/litmuschaos
  # Optional pod imagePullSecrets
  imagePullSecrets: []
ingress:
  enabled: true
  name: litmus-ingress
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web,websecure
    kubernetes.io/ingress.class: traefik-csi
    # kubernetes.io/tls-acme: "true"
    # nginx.ingress.kubernetes.io/rewrite-target: /$1
  ingressClassName: ""
  host:
    # -- This is the ingress hostname (e.g., my-domain.com)
    name: "litmus.k8s-csi-prod01.nivolapiemonte.it"
    frontend:
      # -- You may need to adapt the path depending on your ingress-controller
      path: /(.*)
      # -- Allows setting the [pathType](https://kubernetes.io/docs/concepts/services-networking/ingress/#path-types) for the frontend path
      pathType: ImplementationSpecific
    backend:
      # -- You may need to adapt the path depending on your ingress-controller
      path: /backend/(.*)
      # -- Allows setting the [pathType](https://kubernetes.io/docs/concepts/services-networking/ingress/#path-types) for the backend path
      pathType: ImplementationSpecific
  tls: []
  # - secretName: chart-example-tls
  #   hosts: []
upgradeAgent:
  enabled: true
  controlPlane:
    image:
      repository: upgrade-agent-cp
      tag: "3.1.0"
      pullPolicy: "Always"
    restartPolicy: OnFailure
    nodeSelector: {}
    tolerations: []
    affinity: {}
    resources: {}
    # We usually recommend not to specify default resources and to leave this as a conscious
    # choice for the user. This also increases chances charts run on environments with little
    # resources, such as Minikube. If you do want to specify resources, uncomment the following
    # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
    # limits:
    #   cpu: 100m
    #   memory: 128Mi
    # requests:
    #   cpu: 100m
    #   memory: 128Mi
portal:
  frontend:
    replicas: 1
    autoscaling:
      enabled: false
      minReplicas: 2
      maxReplicas: 3
      targetCPUUtilizationPercentage: 50
      targetMemoryUtilizationPercentage: 50
    updateStrategy: {}
    ## Strategy for deployment updates.
    ##
    ## Example:
    ##
    ## strategy:
    ##   type: RollingUpdate
    ##   rollingUpdate:
    ##     maxSurge: 1
    ##     maxUnavailable: 25%
    automountServiceAccountToken: false
    # securityContext:
    #   runAsUser: 2000
    #   allowPrivilegeEscalation: false
    #   runAsNonRoot: true
    image:
      repository: litmusportal-frontend
      tag: 3.1.0
      pullPolicy: "Always"
    containerPort: 8185
    customLabels: {}
    # my.company.com/tier: "frontend"
    resources:
      # We usually recommend not to specify default resources and to leave this as a conscious
      # choice for the user. This also increases chances charts run on environments with little
      # resources, such as Minikube. If you do want to specify resources, uncomment the following
      # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
      requests:
        memory: "150Mi"
        cpu: "125m"
        ephemeral-storage: "500Mi"
      limits:
        memory: "512Mi"
        cpu: "550m"
        ephemeral-storage: "1Gi"
    livenessProbe:
      failureThreshold: 5
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    readinessProbe:
      initialDelaySeconds: 5
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    service:
      annotations: {}
      type: ClusterIP
      port: 9091
      targetPort: 8185
    virtualService:
      enabled: false
      hosts: []
      gateways: []
      pathPrefixEnabled: false
    nodeSelector: {}
    tolerations: []
    affinity: {}
  server:
    replicas: 1
    updateStrategy: {}
    ## Strategy for deployment updates.
    ##
    ## Example:
    ##
    ## strategy:
    ##   type: RollingUpdate
    ##   rollingUpdate:
    ##     maxSurge: 1
    ##     maxUnavailable: 25%
    serviceAccountName: litmus-server-account
    customLabels: {}
    # my.company.com/tier: "backend"
    waitForMongodb:
      image:
        repository: mongo
        tag: 6
        pullPolicy: "Always"
      securityContext:
        {}
        # runAsUser: 101
        # allowPrivilegeEscalation: false
        # runAsNonRoot: true
        # readOnlyRootFilesystem: true
      resources:
        # We usually recommend not to specify default resources and to leave this as a conscious
        # choice for the user. This also increases chances charts run on environments with little
        # resources, such as Minikube. If you do want to specify resources, uncomment the following
        # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
        requests:
          memory: "150Mi"
          cpu: "25m"
          ephemeral-storage: "500Mi"
        limits:
          memory: "512Mi"
          cpu: "250m"
          ephemeral-storage: "1Gi"
    graphqlServer:
      volumes:
        - name: gitops-storage
          emptyDir: {}
        - name: hub-storage
          emptyDir: {}
      volumeMounts:
        - mountPath: /tmp/
          name: gitops-storage
        - mountPath: /tmp/version
          name: hub-storage
      securityContext:
        runAsUser: 2000
        allowPrivilegeEscalation: false
        runAsNonRoot: true
        readOnlyRootFilesystem: true
      image:
        repository: litmusportal-server
        tag: 3.1.0
        pullPolicy: "Always"
      ports:
        - name: gql-server
          containerPort: 8080
        - name: gql-rpc-server
          containerPort: 8000
      service:
        annotations: {}
        type: ClusterIP
        graphqlServer:
          port: 9002
          targetPort: 8080
        graphqlRpcServer:
          port: 8000
          targetPort: 8000
      imageEnv:
        SUBSCRIBER_IMAGE: "litmusportal-subscriber:3.1.0"
        EVENT_TRACKER_IMAGE: "litmusportal-event-tracker:3.1.0"
        ARGO_WORKFLOW_CONTROLLER_IMAGE: "workflow-controller:v3.3.1"
        ARGO_WORKFLOW_EXECUTOR_IMAGE: "argoexec:v3.3.1"
        LITMUS_CHAOS_OPERATOR_IMAGE: "chaos-operator:3.1.0"
        LITMUS_CHAOS_RUNNER_IMAGE: "chaos-runner:3.1.0"
        LITMUS_CHAOS_EXPORTER_IMAGE: "chaos-exporter:3.1.0"
      genericEnv:
        TLS_SECRET_NAME: ""
        TLS_CERT_64: ""
        CONTAINER_RUNTIME_EXECUTOR: "k8sapi"
        DEFAULT_HUB_BRANCH_NAME: "v3.1.x"
        INFRA_DEPLOYMENTS: '["app=chaos-exporter", "name=chaos-operator", "app=event-tracker", "app=workflow-controller"]'
        LITMUS_AUTH_GRPC_PORT: ":3030"
        WORKFLOW_HELPER_IMAGE_VERSION: "3.1.0"
        REMOTE_HUB_MAX_SIZE: "5000000"
        INFRA_COMPATIBLE_VERSIONS: '["3.1.0"]'
        # Provide UI endpoint if using namespaced scope
        CHAOS_CENTER_UI_ENDPOINT: ""
      resources:
        # We usually recommend not to specify default resources and to leave this as a conscious
        # choice for the user. This also increases chances charts run on environments with little
        # resources, such as Minikube. If you do want to specify resources, uncomment the following
        # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
        requests:
          memory: "250Mi"
          cpu: "225m"
          ephemeral-storage: "500Mi"
        limits:
          memory: "712Mi"
          cpu: "550m"
          ephemeral-storage: "1Gi"
      livenessProbe:
        failureThreshold: 5
        initialDelaySeconds: 30
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5
      readinessProbe:
        initialDelaySeconds: 5
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
    authServer:
      replicas: 1
      autoscaling:
        enabled: false
        minReplicas: 2
        maxReplicas: 3
        targetCPUUtilizationPercentage: 50
        targetMemoryUtilizationPercentage: 50
      securityContext:
        runAsUser: 2000
        allowPrivilegeEscalation: false
        runAsNonRoot: true
        readOnlyRootFilesystem: true
      automountServiceAccountToken: false
      image:
        repository: litmusportal-auth-server
        tag: 3.1.0
        pullPolicy: "Always"
      ports:
        - name: auth-server
          containerPort: 3030
        - name: auth-rpc-server
          containerPort: 3000
      service:
        annotations: {}
        type: ClusterIP
        authServer:
          port: 9003
          targetPort: 3000
        authRpcServer:
          port: 3030
          targetPort: 3030
      env:
        LITMUS_GQL_GRPC_PORT: ":8000"
      resources:
        # We usually recommend not to specify default resources and to leave this as a conscious
        # choice for the user. This also increases chances charts run on environments with little
        # resources, such as Minikube. If you do want to specify resources, uncomment the following
        # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
        requests:
          memory: "250Mi"
          cpu: "225m"
          ephemeral-storage: "500Mi"
        limits:
          memory: "712Mi"
          cpu: "550m"
          ephemeral-storage: "1Gi"
      volumeMounts: []
      volumes: []
    nodeSelector: {}
    tolerations: []
    affinity: {}
# -- Configure the Bitnami MongoDB subchart
# see values at https://github.com/bitnami/charts/blob/master/bitnami/mongodb/values.yaml
mongodb:
  enabled: true
  image:
    debug: true
    # tag: 5.0.8-debian-10-r24
    # TODO: changed the tag as per the post on the litmus issue
    tag: 5.0.23-debian-11-r7
  auth:
    enabled: true
    rootUser: "root"
    rootPassword: "1234"
    replicaSetKey: Blablablablba
    # -- existingSecret Existing secret with MongoDB(®) credentials (keys: `mongodb-passwords`, `mongodb-root-password`, `mongodb-metrics-password`, `mongodb-replica-set-key`)
    existingSecret: ""
  architecture: replicaset
  replicaCount: 3
  persistence:
    enabled: true
    storageClass: "csi-storage-nas"
  volumePermissions:
    enabled: true
  metrics:
    enabled: false
    prometheusRule:
      enabled: false
  customStartupProbe:
    initialDelaySeconds: 5
    periodSeconds: 20
    timeoutSeconds: 10
    successThreshold: 1
    failureThreshold: 30
    exec:
      command:
        - sh
        - -c
        - |
          mongosh --nodb --eval "disableTelemetry()"
          /bitnami/scripts/startup-probe.sh
  arbiter:
    customStartupProbe:
      initialDelaySeconds: 5
      periodSeconds: 20
      timeoutSeconds: 10
      successThreshold: 1
      failureThreshold: 30
      exec:
        command:
          - sh
          - -c
          - |
            mongosh --nodb --eval "disableTelemetry()"
            # /bitnami/scripts/startup-probe.sh
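Since it was the readiness and liveness probes that never succeeded for me, the same `disableTelemetry()` workaround could in principle be applied to them too, via the subchart's `customLivenessProbe` / `customReadinessProbe` values. A sketch only (indented under the `mongodb:` block); the `/bitnami/scripts/*.sh` paths are assumptions based on the Bitnami chart defaults, so verify them in your image first:

```yaml
  # Sketch: apply the telemetry workaround to the other probes as well.
  # Script paths are assumed from the Bitnami chart defaults.
  customLivenessProbe:
    exec:
      command:
        - sh
        - -c
        - |
          mongosh --nodb --eval "disableTelemetry()"
          /bitnami/scripts/ping-mongodb.sh
  customReadinessProbe:
    exec:
      command:
        - sh
        - -c
        - |
          mongosh --nodb --eval "disableTelemetry()"
          /bitnami/scripts/readiness-probe.sh
```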
Logs from the arbiter:
mongodb 08:50:42.29 INFO ==>
2024-03-15T09:50:42.295159739+01:00 mongodb 08:50:42.29 INFO ==> Welcome to the Bitnami mongodb container
2024-03-15T09:50:42.296505375+01:00 mongodb 08:50:42.29 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
2024-03-15T09:50:42.297895529+01:00 mongodb 08:50:42.29 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
2024-03-15T09:50:42.299243104+01:00 mongodb 08:50:42.29 INFO ==>
mongodb 08:50:42.30 INFO ==> ** Starting MongoDB setup **
2024-03-15T09:50:42.315436143+01:00 mongodb 08:50:42.31 INFO ==> Validating settings in MONGODB_* env vars...
2024-03-15T09:50:42.352489000+01:00 mongodb 08:50:42.35 INFO ==> Initializing MongoDB...
2024-03-15T09:50:42.391994260+01:00 mongodb 08:50:42.39 INFO ==> Writing keyfile for replica set authentication...
2024-03-15T09:50:42.400499164+01:00 mongodb 08:50:42.40 INFO ==> Deploying MongoDB from scratch...
mongodb 08:50:42.40 DEBUG ==> Starting MongoDB in background...
2024-03-15T09:50:42.429154640+01:00 about to fork child process, waiting until server is ready for connections.
2024-03-15T09:50:42.430277593+01:00 forked process: 52
2024-03-15T09:50:43.042899555+01:00 child process started successfully, parent exiting
2024-03-15T09:50:43.510483293+01:00 MongoNetworkError: connect ECONNREFUSED 10.42.5.195:27017
2024-03-15T09:50:43.519287180+01:00 mongodb 08:50:43.51 INFO ==> Creating users...
mongodb 08:50:43.52 INFO ==> Users created
2024-03-15T09:50:43.538146600+01:00 mongodb 08:50:43.53 INFO ==> Configuring MongoDB replica set...
2024-03-15T09:50:43.544561571+01:00 mongodb 08:50:43.54 INFO ==> Stopping MongoDB...
2024-03-15T09:50:44.554157529+01:00 mongodb 08:50:44.55 DEBUG ==> Starting MongoDB in background...
2024-03-15T09:50:44.580869195+01:00 about to fork child process, waiting until server is ready for connections.
2024-03-15T09:50:44.582002932+01:00 forked process: 140
child process started successfully, parent exiting
2024-03-15T09:50:46.542079135+01:00 mongodb 08:50:46.54 DEBUG ==> Waiting for primary node...
2024-03-15T09:50:46.543715214+01:00 mongodb 08:50:46.54 DEBUG ==> Waiting for primary node...
2024-03-15T09:50:46.545308003+01:00 mongodb 08:50:46.54 INFO ==> Trying to connect to MongoDB server litmus-prod01-mongodb-0.litmus-prod01-mongodb-headless.litmus.svc.cluster.local...
2024-03-15T09:50:46.551996683+01:00 mongodb 08:50:46.55 INFO ==> Found MongoDB server listening at litmus-prod01-mongodb-0.litmus-prod01-mongodb-headless.litmus.svc.cluster.local:27017 !
2024-03-15T09:51:27.409527026+01:00 mongodb 08:51:27.40 ERROR ==> Node litmus-prod01-mongodb-0.litmus-prod01-mongodb-headless.litmus.svc.cluster.local did not become available
2024-03-15T09:51:27.414338512+01:00 mongodb 08:51:27.41 INFO ==> Stopping MongoDB...
I need some advice or help with the Litmus installation.
This is a fresh install of Litmus; MongoDB never becomes ready, and its logs show many errors. The values.yaml I used is shown above.