dragonflydb / dragonfly

A modern replacement for Redis and Memcached
https://www.dragonflydb.io/

Helm-Chart #57

Closed Hermsi1337 closed 2 years ago

Hermsi1337 commented 2 years ago

Hey Folks,

I just stumbled across this insane project - great work so far! 🚀

I was wondering if you guys are already working on a Helm chart in order to deploy dragonfly to Kubernetes? It would make it a lot easier for us to test dragonfly in our environment.

Best regards!

romange commented 2 years ago

Hi @Hermsi1337, I personally have not run it on K8s but I know that @zacharya19 runs it there.

Care to chip in, @zacharya19?

zacharya19 commented 2 years ago

Care to chip in, @zacharya19?

I would love to :)

I was wondering if you guys are already working on a Helm chart in order to deploy dragonfly to Kubernetes? It would make it a lot easier for us to test dragonfly in our environment.

@Hermsi1337 I wouldn't say it's already a work in progress, but it is probably something that would be easy to create, as at the moment Dragonfly is a single binary. IMHO, it might be a bit early to create a helm chart, but it would be really interesting to build a K8S operator in the future to manage clusters and backups once HA is available. Do you have any specific use-case where the helm chart would make the deployment better?

Here is my simple deployment to K8S, feel free to ask if you have any questions:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dragonfly
  namespace: infra
spec:
  selector:
    matchLabels:
      app: dragonfly
  template:
    metadata:
      labels:
        app: dragonfly
    spec:
      containers:
        - name: dragonfly
          image: docker.dragonflydb.io/dragonflydb/dragonfly
          args:
            - --alsologtostderr
          resources:
            requests:
              memory: "512Mi"
              cpu: "1000m"
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: dragonfly-pdb
  namespace: infra
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: dragonfly
---
apiVersion: v1
kind: Service
metadata:
  name: dragonfly
  namespace: infra
spec:
  selector:
    app: dragonfly
  ports:
    - port: 6379
      targetPort: 6379

I also see @TamCore might be working on a helm chart, which I would be happy to review :)

tamcore commented 2 years ago

It's very minimal and certainly still needs some love (liveness + readiness probes, storage for /data, ...). If you want to have a look: https://github.com/TamCore/dragonfly/tree/helm-chart/charts/dragonfly :) But until now, the rendered output is basically identical to the Deployment provided above :)
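
If anyone wants a starting point for those probes, a minimal TCP-based version could look like this (values are just placeholders, it goes under the container spec of the Deployment above):

# under spec.template.spec.containers[0] of the Deployment above
livenessProbe:
  tcpSocket:
    port: 6379
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  tcpSocket:
    port: 6379
  initialDelaySeconds: 5
  periodSeconds: 5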

Hermsi1337 commented 2 years ago

@Hermsi1337 Do you have any specific use-case where the helm chart would make the deployment better?

For me, in the context of Kubernetes, a Helm chart is always better when deploying.
I don't have to take care of config-related stuff (startup args, env vars, PVCs, ...you name it).

In the end, when building applications, I want to spend the least time possible on configuration/deployment. Using dragonfly with helm is 5 lines of code vs. approx. 100 (or more) lines without it.
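
Just to illustrate, with a chart like the work-in-progress one @TamCore linked above, the whole thing would boil down to something roughly like this (chart path assumed from that branch, namespace taken from the example above):

git clone -b helm-chart https://github.com/TamCore/dragonfly.git
helm install dragonfly ./dragonfly/charts/dragonfly --namespace infra --create-namespace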

drinkbeer commented 2 years ago

Hey @zacharya19, when I am applying this yaml in a cluster in Google Kubernetes Engine, I found the following errors:

➜  ~ k logs pod/dragonfly-5b4dbf68dc-kvwkv
I20220601 02:57:38.824812     1 init.cc:56] dragonfly running in opt mode.
I20220601 02:57:38.824887     1 dfly_main.cc:179] maxmemory has not been specified. Deciding myself....
I20220601 02:57:38.824970     1 dfly_main.cc:184] Found 122.54GiB available memory. Setting maxmemory to 98.03GiB
E20220601 02:57:38.825217     9 proactor.cc:398] io_uring does not have enough memory. That can happen when your max locked memory is too limited. If you run me via docker, try adding '--ulimit memlock=-1' todocker run command

Do you know how to set '--ulimit memlock=-1' for a Docker container in K8S?

tamcore commented 2 years ago

That would need to be done through /etc/docker/docker.json. But I fear that's not feasible in GKE (or any other managed K8s offering, especially when containerd is used), as nodes might be automatically replaced during the weekly maintenance windows.
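
For reference, on nodes where you do control the Docker daemon, the daemon-level equivalent would be a default-ulimits entry in that config file, roughly:

{
  "default-ulimits": {
    "memlock": {
      "Name": "memlock",
      "Hard": -1,
      "Soft": -1
    }
  }
}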

Try including this in your deployment.yaml:

      initContainers:
        - name: increase-memlock-ulimit
          image: busybox
          command: ["sh", "-c", "ulimit -l unlimited"]
          securityContext:
            privileged: true
romange commented 2 years ago

@TamCore I see you have great docker knowledge. Do you know why on some host distributions I need to use memlock and on others I do not? For example, my Ubuntu 21.10 does not require the memlock argument and does not have an /etc/docker/docker.json file. Bullseye, on the other hand, requires this argument.

My suspicion is that io_uring dropped this requirement from some kernel version onward but I am not sure...

tamcore commented 2 years ago

Depending on the distro / installation method, the file can also be called daemon.json. But it doesn't have to exist. For example, I have an AlmaLinux 8.5 LXC container with docker-ce installed through the official docker repos, and there it wasn't created by default.

But regarding the limit, I think that's just distribution-specific. Ubuntu seems to set memlock to a friendly 65536, whereas RHEL-compatible distributions seem to set it to a conservative 64.

romange commented 2 years ago

@TamCore but if there is no config file, do you know where the distro specifies this limit? Specifically, how can I decrease the limit on Ubuntu so it will be 64 like on other distributions? Or do you think they compile the kernel with this parameter?

tamcore commented 2 years ago

The docker config has nothing to do with the system-wide limit :) The limit itself is passed through and changed in multiple places. What the kernel has compiled in, you can see in /proc/1/limits. What currently applies to you, you can see with ulimit -a. And then you have things like your init system (most likely systemd), PAM and so on possibly interfering with it. If you want to make permanent changes, your best bet is either /etc/security/limits.conf or dropping a file into /etc/security/limits.d/*.conf.
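
For example, a drop-in that raises the memlock limit for all users could look like this (values are just an example):

# /etc/security/limits.d/99-memlock.conf
*    soft    memlock    unlimited
*    hard    memlock    unlimited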

romange commented 2 years ago

Got ya. So what's interesting is that when I limit my locally running dragonfly to 1KB of locked memory, it still runs. I will confirm with @axboe whether they removed the restriction on mlocked memory - it can greatly simplify deployment of io_uring-based apps.
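
If you want to reproduce that experiment locally, something along these lines should do it (run it in a subshell so the lowered soft limit does not stick to your session):

# cap locked memory at 1 KiB for this subshell only, then start dragonfly
(ulimit -S -l 1 && dragonfly --alsologtostderr)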

axboe commented 2 years ago

Yes, newer kernels generally rely on memcg rather than the memlock rlimit.

romange commented 2 years ago

Thanks Jens, that was quick! Do you happen to remember starting from which version it applies to io_uring?

axboe commented 2 years ago

I'll check when I'm on the laptop.

romange commented 2 years ago

Thanks!

axboe commented 2 years ago

It happened with the 5.12 release.

drinkbeer commented 2 years ago

Try including this in your deployment.yaml:

Hey @TamCore, thank you for your reply. I still have the same error after adding the initContainers to the deployment.yaml.

romange commented 2 years ago

@drinkbeer Can you pls attach here the yml files and the gcloud command of how you deploy dragonflydb?

I will try to reproduce the issue on my gcp account.

romange commented 2 years ago

Judging from https://github.com/kubernetes/kubernetes/issues/3595, there is no straightforward way to pass rlimits to containers on K8S.

Note: I found this article that may provide a way to work around the problem.

Unfortunately I do not have much knowledge about Linux capabilities. @axboe Jens, do you think that a process with a low memlock limit but with the IPC_LOCK cap will be able to run io_uring on a 5.10 kernel?

laupse commented 2 years ago

For information, I deployed it on a k8s cluster running on Ubuntu 22.04 and it works without having to change the limit (the initContainers approach seems to have no effect). I guess that is because of the newer kernel that ships with this distro version.

axboe commented 2 years ago

If IPC_LOCK cap is set, then no accounting is done. So yes, it should work fine with 5.10 without having to fiddle with the rlimits for that case.

romange commented 2 years ago

Thanks Jens. So this will probably be the way to improve K8S integration.

@laupse - yeah - 22.04 ships 5.15, which does not impose ulimit restrictions on dragonfly. My worry is about folks who use default K8S images like COS, which uses 5.10.
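
If you want to check what your nodes actually run, the kernel version is exposed in the node status:

kubectl get nodes -o custom-columns=NAME:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion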

drinkbeer commented 2 years ago

@drinkbeer Can you pls attach here the yml files and the gcloud command of how you deploy dragonflydb?

I will try to reproduce the issue on my gcp account.

Hey @romange, thank you for your reply!

This is my yaml file: https://gist.github.com/drinkbeer/d753e51fca6f6221928a0b95a7a520f4 I run k --context <my_context> -n <my_namespace> apply -f ~/Documents/deployment-dragonfly.yaml to apply it in GKE 1.22 (kernel 5.10+).

I talked with the Google guys; it seems the /usr/lib/systemd/system/docker.service file is read-only in Container-Optimized OS.

Then I provisioned a "Debian GNU/Linux 11 (bullseye)" instance in Google Compute Engine and installed K8S on it. I saw the same error from Dragonfly: io_uring does not have enough memory. That can happen when your max locked memory is too limited. If you run me via docker, try adding '--ulimit memlock=-1' todocker run command. Debian 11 is using kernel 5.10:

jc@master-node:~/deployment-files$ k describe node/master-node | grep Kernel
  Kernel Version:             5.10.0-14-cloud-amd64

I will try Ubuntu 22.04 as @laupse said, and update the results here.

romange commented 2 years ago

I will try to fix it for COS and Bullseye. This will require a container rebuild.
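
Roughly, the rebuild amounts to giving the binary the file capability so that a non-root user keeps CAP_IPC_LOCK; a sketch (the binary path and the dfly user are assumptions about the image layout):

FROM docker.dragonflydb.io/dragonflydb/dragonfly
USER root
# install setcap if it is not already present, then grant the file capability
# (binary path is an assumption)
RUN apt-get update && apt-get install -y --no-install-recommends libcap2-bin \
    && setcap cap_ipc_lock=+ep /usr/local/bin/dragonfly
USER dfly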


romange commented 2 years ago

@drinkbeer I created a trial image with cap_ipc_lock capability. Can you try it out with COS image 5.10? You will need the following changes to your yml:

      containers:
        - name: dragonfly
          image: ghcr.io/dragonflydb/dragonfly:unlocked
          args: ["--alsologtostderr"]          
          securityContext:
            runAsNonRoot: true
            runAsUser: 999
            capabilities:
              add: ["IPC_LOCK"]
drinkbeer commented 2 years ago

@drinkbeer I created a trial image with cap_ipc_lock capability. Can you try it out with COS image 5.10? You will need the following changes to your yml:

      containers:
        - name: dragonfly
          image: ghcr.io/dragonflydb/dragonfly:unlocked
          args: ["--alsologtostderr"]          
          securityContext:
            runAsNonRoot: true
            runAsUser: 999
            capabilities:
              add: ["IPC_LOCK"]

Hey @romange, I can confirm that it works in my GKE 1.22 cluster (nodes running Container-Optimized OS, pods running Ubuntu 20.04).

Here is the complete yaml file I used:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dragonfly
  namespace: jason-test
spec:
  selector:
    matchLabels:
      app: dragonfly
  template:
    metadata:
      labels:
        app: dragonfly
    spec:
      containers:
        - name: dragonfly
          image: ghcr.io/dragonflydb/dragonfly:unlocked
          args: ["--alsologtostderr"]          
          securityContext:
            runAsNonRoot: true
            runAsUser: 999
            capabilities:
              add: ["IPC_LOCK"]
          resources:
            limits:
              cpu: "6"
              memory: 16000Mi
            requests:
              cpu: "2"
              memory: 12000Mi
      tolerations:
      - effect: NoExecute
        key: app
        operator: Equal
        value: vecache
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 30
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 30
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: dragonfly-pdb
  namespace: jason-test
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: dragonfly
---
apiVersion: v1
kind: Service
metadata:
  name: dragonfly
  namespace: jason-test
spec:
  selector:
    app: dragonfly
  ports:
    - port: 6379
      targetPort: 6379

Node:

jchome@gke-staging-cq-state-us-eas-vecache-1-a4098099-jyh2 ~ $ cat /etc/os-release
NAME="Container-Optimized OS"
ID=cos
PRETTY_NAME="Container-Optimized OS from Google"
HOME_URL="https://cloud.google.com/container-optimized-os/docs"
BUG_REPORT_URL="https://cloud.google.com/container-optimized-os/docs/resources/support-policy#contact_us"
GOOGLE_CRASH_ID=Lakitu
GOOGLE_METRICS_PRODUCT_ID=26
KERNEL_COMMIT_ID=9b0572c38f1c8b1d43f041fbb5c87dee0d2e22d2
VERSION=93
VERSION_ID=93
BUILD_ID=16623.171.10

Pod:

➜  ~ k exec dragonfly-644c86468c-pkdvp -it -- sh
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Running and validating:

➜  ~ kgp
NAME                         READY   STATUS    RESTARTS       AGE     IP            NODE                                                  NOMINATED NODE   READINESS GATES
dragonfly-644c86468c-pkdvp   1/1     Running   0              3m11s   10.49.56.36   gke-staging-cq-state-us-eas-vecache-1-a4098099-jyh2   <none>           <none>
memtier-76c4c75494-v7nrp     1/1     Running   15 (78m ago)   34h     10.49.56.19   gke-staging-cq-state-us-eas-vecache-1-a4098099-jyh2   <none>           <none>
vecache-0                    1/1     Running   0              34h     10.49.6.18    gke-staging-cq-state-us-eas-vecache-1-a4098099-w0vx   <none>           <none>
➜  ~ k logs pod/dragonfly-644c86468c-pkdvp
I20220602 16:19:18.769198     1 init.cc:56] dragonfly running in opt mode.
I20220602 16:19:18.769333     1 dfly_main.cc:179] maxmemory has not been specified. Deciding myself....
I20220602 16:19:18.769438     1 dfly_main.cc:184] Found 122.48GiB available memory. Setting maxmemory to 97.99GiB
I20220602 16:19:18.769925     8 proactor.cc:428] IORing with 1024 entries, allocated 102720 bytes, cq_entries is 2048
I20220602 16:19:18.778975     1 proactor_pool.cc:66] Running 16 io threads
I20220602 16:19:18.784658     1 server_family.cc:186] Data directory is "/data"
I20220602 16:19:18.784929     1 server_family.cc:113] Checking "/data/dump"
I20220602 16:19:18.788194    10 listener_interface.cc:79] sock[51] AcceptServer - listening on port 6379
➜  ~  k --context staging-cq-state-us-east1-1 -n jason-test port-forward dragonfly-644c86468c-pkdvp 6379:6379
Forwarding from 127.0.0.1:6379 -> 6379
Forwarding from [::1]:6379 -> 6379
Handling connection for 6379
Handling connection for 6379

In another terminal, run redis-cli to validate that it works:

➜  ~ echo "ping" | redis-cli
PONG
➜  ~ echo "info all" | redis-cli
# Server
redis_version:df-0.1
redis_mode:standalone
arch_bits:64
multiplexing_api:iouring
tcp_port:6379
uptime_in_seconds:604
uptime_in_days:0

# Clients
connected_clients:1
client_read_buf_capacity:256
blocked_clients:0

# Memory
used_memory:2642760
used_memory_human:2.52MiB
used_memory_peak:2642760
comitted_memory:65142784
used_memory_rss:12959744
used_memory_rss_human:12.36MiB
object_used_memory:0
table_used_memory:2290800
num_buckets:50400
num_entries:0
inline_keys:0
strval_bytes:0
listpack_blobs:0
listpack_bytes:0
small_string_bytes:0
maxmemory:105211671347
maxmemory_human:97.99GiB
cache_mode:store

# Stats
instantaneous_ops_per_sec:0
total_commands_processed:4
total_pipelined_commands:0
total_net_input_bytes:37
total_net_output_bytes:14423
instantaneous_input_kbps:-1
instantaneous_output_kbps:-1
rejected_connections:-1
expired_keys:0
evicted_keys:0
garbage_checked:0
garbage_collected:0
bump_ups:0
stash_unloaded:0
traverse_ttl_sec:0
delete_ttl_sec:0
keyspace_hits:-1
keyspace_misses:-1
total_reads_processed:2
total_writes_processed:2441
async_writes_count:0

# TIERED
external_entries:0
external_bytes:0
external_reads:0
external_writes:0
external_reserved:0
external_capacity:0

# PERSISTENCE
last_save:1654186758
last_save_file:

# Replication
role:master
connected_slaves:0

# Commandstats
cmd_PING:1
cmd_INFO:1
cmd_COMMAND:2

# Errorstats

# Keyspace
db0:keys=0,expires=0,avg_ttl=-1
drinkbeer commented 2 years ago

@romange, there is another way to pass the ulimit -l unlimited config to the shell in the pod; it also works on COS and older versions of Linux (kernel version <= 5.10). I validated it in my GKE 1.22 cluster.

The yaml:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dragonfly
  namespace: jason-test
spec:
  selector:
    matchLabels:
      app: dragonfly
  template:
    metadata:
      labels:
        app: dragonfly
    spec:
      containers:
        - name: dragonfly
          image: docker.dragonflydb.io/dragonflydb/dragonfly
          command: ["/bin/sh"]
          args: ["-c", "ulimit -l unlimited && dragonfly --logtostderr"]
          resources:
            limits:
              cpu: "6"
              memory: 16000Mi
            requests:
              cpu: "2"
              memory: 12000Mi
          securityContext:
            privileged: true
      tolerations:
      - effect: NoExecute
        key: app
        operator: Equal
        value: vecache
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 30
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 30
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: dragonfly-pdb
  namespace: jason-test
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: dragonfly
---
apiVersion: v1
kind: Service
metadata:
  name: dragonfly
  namespace: jason-test
spec:
  selector:
    app: dragonfly
  ports:
    - port: 6379
      targetPort: 6379
romange commented 2 years ago

@romange, there is another way to pass the ulimit -l unlimited config to the shell in the pod; it also works on COS and older versions of Linux (kernel version <= 5.10). I validated it in my GKE 1.22 cluster.


Nice. Well, that settles it. I would not want to maintain two images just because of this inconvenience with K8S. One little simplification I want to try is to set memlock limits inside /etc/security/limits.conf on the image for the dfly user that runs dragonfly. Maybe in that case we won't need to wrap dragonfly with the shell command and ulimit.
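
Concretely, the change I have in mind is just this in the image build; whether pam_limits actually applies it to a container entrypoint that never goes through a login session is exactly what I need to verify:

# append memlock limits for the dfly user to the image's limits.conf
RUN echo "dfly soft memlock unlimited" >> /etc/security/limits.conf \
    && echo "dfly hard memlock unlimited" >> /etc/security/limits.conf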