OpenCTI-Platform / opencti

Open Cyber Threat Intelligence Platform
https://opencti.io

[REDIS-CLUSTER] ioredis failing to connect to Redis Cluster #4557

Open MaxwellDPS opened 8 months ago

MaxwellDPS commented 8 months ago

Environment

  1. OS (where OpenCTI server runs): CENTOS 9
  2. OpenCTI version: 5.10.2
  3. OpenCTI client: (All of the above?)
  4. Other environment details: CLUSTERED & SCALED on K8s

Reproducible Steps

Steps to create the smallest reproducible scenario:

  1. Deploy the Bitnami Helm chart for Redis Cluster with a password, no TLS, and no network policy.
  2. Use the config below and validate connectivity to the Redis pods.
  3. The platform fails to initialize.

Expected Output

Actual Output

CONNECTING TO REDIS FROM OCTI POD

/opt/opencti # redis-cli -h section31-redis-cluster-0.section31-redis-cluster-headless -a 6z75xo586x58x96cp7
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
section31-redis-cluster-0.section31-redis-cluster-headless:6379> SET YE YEE
(error) MOVED 11334 192.168.6.142:6379
section31-redis-cluster-0.section31-redis-cluster-headless:6379> GET YEE
(error) MOVED 10759 192.168.7.179:6379
section31-redis-cluster-0.section31-redis-cluster-headless:6379> 
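
A note on reading the session above: `MOVED` replies are the normal redirection a cluster node sends when a key's hash slot lives on another node, and `redis-cli` only follows them when started with `-c`. So this output suggests the node is reachable and cluster-aware, rather than indicating a fault. As a minimal sketch (not code from OpenCTI or ioredis), the slot for a key can be computed with the CRC16 (XMODEM) hash modulo 16384 described in the Redis Cluster specification. The function names `crc16` and `keySlot` are illustrative only.

```typescript
// CRC16, XMODEM variant (polynomial 0x1021, initial value 0),
// the variant named in the Redis Cluster specification.
function crc16(bytes: Uint8Array): number {
  let crc = 0;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i] << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) !== 0 ? (crc << 1) ^ 0x1021 : crc << 1;
      crc &= 0xffff; // keep the register at 16 bits
    }
  }
  return crc;
}

// Hypothetical helper: map a key to one of the 16384 cluster hash slots.
// If the key contains a non-empty {hash tag}, only the tag is hashed.
function keySlot(key: string): number {
  let hashed = key;
  const open = key.indexOf("{");
  if (open !== -1) {
    const close = key.indexOf("}", open + 1);
    if (close > open + 1) {
      hashed = key.slice(open + 1, close);
    }
  }
  return crc16(new TextEncoder().encode(hashed)) % 16384;
}

// The two keys used in the redis-cli session above.
console.log(keySlot("YE"), keySlot("YEE")); // 11334 10759, matching the MOVED replies
```

Since the computed slots agree with the `MOVED` targets, the redirections look consistent with a working slot map on that node. That does not by itself explain the ioredis error below.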

OCTI LOGS

{"category":"APP","error":{"context":{},"message":"Failed to refresh slots cache.","name":"ClusterAllFailedError","stack":"ClusterAllFailedError: Failed to refresh slots cache.\n    at tryNode (/opt/opencti/build/node_modules/ioredis/built/cluster/index.js:308:31)\n    at callback (/opt/opencti/build/node_modules/ioredis/built/cluster/index.js:325:21)\n    at /opt/opencti/build/node_modules/ioredis/built/cluster/index.js:662:24\n    at run (/opt/opencti/build/node_modules/ioredis/built/utils/index.js:117:22)\n    at tryCatcher (/opt/opencti/build/node_modules/standard-as-callback/built/utils.js:12:23)\n    at /opt/opencti/build/node_modules/standard-as-callback/built/index.js:33:50\n    at processTicksAndRejections (node:internal/process/task_queues:95:5)"},"level":"error","message":"[REDIS] Redis 'base' client error","timestamp":"2023-10-13T18:25:57.587Z","version":"5.10.2"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'base' client closed","timestamp":"2023-10-13T18:25:57.587Z","version":"5.10.2"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'base' client closed","timestamp":"2023-10-13T18:25:57.587Z","version":"5.10.2"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'base' client closed","timestamp":"2023-10-13T18:25:57.587Z","version":"5.10.2"}
{"category":"APP","level":"info","message":"[REDIS] 'base' Redis client reconnecting","timestamp":"2023-10-13T18:25:57.587Z","version":"5.10.2"}

Additional information

REDIS CONFIG

REDIS__PORT=6379
REDIS__MODE=cluster
REDIS__INCLUDE_INFERENCES=false
REDIS__HOSTNAMES=["redis-cluster-0.redis-cluster-headless","redis-cluster-1.redis-cluster-headless","redis-cluster-2.redis-cluster-headless"]
REDIS__USE_SSL=false
REDIS__TRIMMING=2000000
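
The listing above shows no password variable, although the reproduction steps describe a password-protected cluster. Whether one was set and simply left out of the paste is not stated. As a rough sketch of what an ioredis cluster client needs, the helper below turns such environment values into the `startupNodes` array and the `redisOptions` object that ioredis' `Cluster` constructor accepts. The function `buildClusterConfig` and the variable name `REDIS__PASSWORD` are assumptions for illustration; this is not how OpenCTI itself is confirmed to build its client.

```typescript
interface StartupNode {
  host: string;
  port: number;
}

interface ClusterConfig {
  startupNodes: StartupNode[];
  options: { redisOptions: { password?: string } };
}

// Hypothetical helper: derive ioredis Cluster arguments from env-style settings.
function buildClusterConfig(env: Record<string, string | undefined>): ClusterConfig {
  const port = Number(env.REDIS__PORT ?? "6379");
  // REDIS__HOSTNAMES is given as a JSON array of host names in the issue.
  const hostnames: string[] = JSON.parse(env.REDIS__HOSTNAMES ?? "[]");
  const startupNodes = hostnames.map((host) => ({ host, port }));

  const redisOptions: { password?: string } = {};
  // Assumed variable name; without a password, a protected cluster
  // would reject the commands used to discover the slot map.
  if (env.REDIS__PASSWORD) {
    redisOptions.password = env.REDIS__PASSWORD;
  }
  return { startupNodes, options: { redisOptions } };
}

const example = buildClusterConfig({
  REDIS__PORT: "6379",
  REDIS__HOSTNAMES:
    '["redis-cluster-0.redis-cluster-headless","redis-cluster-1.redis-cluster-headless"]',
});
console.log(example.startupNodes);

// With ioredis installed, the result would be used roughly as:
//   import { Cluster } from "ioredis";
//   const client = new Cluster(example.startupNodes, example.options);
```

Keeping the password under `redisOptions` matters because ioredis applies those options to every node connection it opens, including the ones it discovers after the initial slot refresh.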

SamuelHassine commented 8 months ago

This exception is raised within ioredis, not by the platform. The Redis cluster may not be healthy enough to handle all the features we need.

I'm not sure you can put all the nodes in REDIS__HOSTNAMES; some Redis nodes may not be able to accept data.

We are currently testing against Redis Cluster and will let you know.
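
One way to check which nodes can currently accept writes is to look at the output of the `CLUSTER NODES` command, where each line carries the node's address, its flags (`master`, `slave`, `fail`, ...) and, for masters, the slot ranges it owns. The sketch below parses that text format and lists the masters that own slots. The function names and the sample output are made up for illustration and do not come from the cluster in this issue.

```typescript
interface ClusterNode {
  id: string;
  address: string; // ip:port, with the cluster bus port stripped
  flags: string[];
  linkState: string;
  slots: string[];
}

// Parse the text returned by CLUSTER NODES, one node per line:
// <id> <ip:port@cport> <flags> <master> <ping> <pong> <epoch> <link-state> <slot>...
function parseClusterNodes(output: string): ClusterNode[] {
  return output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const fields = line.split(/\s+/);
      return {
        id: fields[0],
        address: fields[1].split("@")[0],
        flags: fields[2].split(","),
        linkState: fields[7],
        slots: fields.slice(8),
      };
    });
}

// Masters that are not marked as failed and that own at least one slot range.
function slotOwningMasters(nodes: ClusterNode[]): string[] {
  return nodes
    .filter(
      (n) =>
        n.flags.indexOf("master") !== -1 &&
        n.flags.indexOf("fail") === -1 &&
        n.slots.length > 0
    )
    .map((n) => n.address);
}

// Made-up sample output for a 3-master, 3-replica cluster.
const sample = `
aaa1 10.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-5460
aaa2 10.0.0.2:6379@16379 master - 0 0 2 connected 5461-10922
aaa3 10.0.0.3:6379@16379 master - 0 0 3 connected 10923-16383
bbb1 10.0.0.4:6379@16379 slave aaa1 0 0 1 connected
bbb2 10.0.0.5:6379@16379 slave aaa2 0 0 2 connected
bbb3 10.0.0.6:6379@16379 slave aaa3 0 0 3 connected
`;
console.log(slotOwningMasters(parseClusterNodes(sample)));
```

Roles can change after a failover, so the set of masters is better read from the cluster itself than inferred from pod ordinals.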

MaxwellDPS commented 8 months ago

Hey @SamuelHassine! I'm only using the "master" Redis nodes in REDIS__HOSTNAMES, so in the cluster of 6 that's 0, 1, and 2. Here's the Bitnami Redis config for reference:

global:
  imageRegistry:
  ## E.g.
  ## imagePullSecrets:
  ##   - myRegistryKeySecretName
  ##
  imagePullSecrets: []
  storageClass: ""
  redis:
    password: ""

image:
  registry:
  repository: bitnami/redis-cluster
  tag: 7.2.1-debian-11-r26
  digest: ""
  pullPolicy: IfNotPresent
  pullSecrets: []
  debug: false

networkPolicy:
  enabled: false
  allowExternal: true

serviceAccount:
  create: true

rbac:
  create: false

podSecurityContext:
  enabled: true
  fsGroup: 1001
  runAsUser: 1001

containerSecurityContext:
  enabled: true
  runAsUser: 1001
  runAsNonRoot: true

usePassword: true

password: ""
existingSecret: "redis-secrets"
existingSecretPasswordKey: "password"
usePasswordFile: false

tls:
  enabled: false

service:
  ports:
    redis: 6379
  nodePorts:
    redis: ""

  extraPorts: []
  type: ClusterIP
  externalTrafficPolicy: Cluster
  sessionAffinity: None

persistence:
  enabled: true
  path: /bitnami/redis/data
  accessModes:
    - ReadWriteOnce
  size: 8Gi

persistentVolumeClaimRetentionPolicy:
  enabled: false
  whenScaled: Retain
  whenDeleted: Delete

redis:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0
  useAOFPersistence: "yes"
  containerPorts:
    redis: 6379
    bus: 16379

  resources:
    limits:
      memory: 4G
      cpu: "1"
    requests:
      memory: 256Mi
      cpu: "250m"

cluster:
  init: true
  nodes: 6
  replicas: 1

MaxwellDPS commented 5 months ago

Any update here?

MaxwellDPS commented 4 months ago

@SamuelHassine @richard-julien Any update here?