dragonflydb / dragonfly

A modern replacement for Redis and Memcached
https://www.dragonflydb.io/

Dragonfly panics on start with Docker #2391

Closed: AlphaNecron closed this issue 10 months ago

AlphaNecron commented 10 months ago

Describe the bug

Dragonfly crashes on start with a minimal configuration.

I20240109 06:27:39.990960     1 init.cc:70] dragonfly running in opt mode.
I20240109 06:27:39.991027     1 dfly_main.cc:800] Starting dragonfly df-v1.13.0-f39eac5bcaf7c8ffe5c433a0e8e15747391199d9
* Logs will be written to the first available of the following paths:
/tmp/dragonfly.*
./dragonfly.*
* For the available flags type dragonfly [--help | --helpfull]
* Documentation can be found at: https://www.dragonflydb.io/docs
F20240109 06:27:39.991153     1 dfly_main.cc:659] Check failed: res.size() == 2u (1 vs. 2)
*** Check failure stack trace: ***
    @     0x55b52bccb9f3  google::LogMessage::SendToLog()
    @     0x55b52bcc41b7  google::LogMessage::Flush()
    @     0x55b52bcc5b3f  google::LogMessageFatal::~LogMessageFatal()
    @     0x55b52b31250d  _ZZN4dfly12_GLOBAL__N_137UpdateResourceLimitsIfInsideContainerEPN2io11MemInfoDataEPmENKUlSt17basic_string_viewIcSt11char_traitsIcEES4_E0_clES8_S4_.isra.0
    @     0x55b52b2ee685  main
    @     0x7f159c90a083  __libc_start_main
    @     0x55b52b31080e  _start
    @              (nil)  (unknown)
*** SIGABRT received at time=1704781659 on cpu 4 ***
PC: @     0x7f159c92900b  (unknown)  raise
[failure_signal_handler.cc : 345] RAW: Signal 11 raised at PC=0x7f159c908941 while already in AbslFailureSignalHandler()
*** SIGSEGV received at time=1704781659 on cpu 4 ***
PC: @     0x7f159c908941  (unknown)  abort

To Reproduce

Steps to reproduce the behavior:

  1. Install Docker Compose.
  2. Copy the example docker-compose.yml.
  3. Run docker compose up -d

Expected behavior

Dragonfly should start and run normally.

Reproducible Code Snippet

version: '3'
services:
  dragonfly:
    image: 'docker.dragonflydb.io/dragonflydb/dragonfly'
    cpuset: '0-5'
    mem_reservation: '1G'
    mem_limit: '2G'
    restart: unless-stopped
    ulimits:
      memlock: -1
    network_mode: host
#    ports:
#      - 6379:6379
    command:
      - '--conn_io_threads=8'
      - '--use_zset_tree=true'
      - '--maxmemory=1536MB'
      - '--cache_mode=true'
    volumes:
      - dragonflydata:/data
volumes:
  dragonflydata:

This is my altered docker-compose.yml; Dragonfly also refuses to start with the default docker-compose.yml.

romange commented 10 months ago

Thanks for reporting this. Can you please add --vmodule=dfly_main=1 to the command section, rerun, and attach the failure output again?

AlphaNecron commented 10 months ago

arctic-dragonfly-1  | I20240109 13:35:23.957870     1 dfly_main.cc:800] Starting dragonfly df-v1.13.0-f39eac5bcaf7c8ffe5c433a0e8e15747391199d9
arctic-dragonfly-1  | * Logs will be written to the first available of the following paths:
arctic-dragonfly-1  | /tmp/dragonfly.*
arctic-dragonfly-1  | ./dragonfly.*
arctic-dragonfly-1  | * For the available flags type dragonfly [--help | --helpfull]
arctic-dragonfly-1  | * Documentation can be found at: https://www.dragonflydb.io/docs
arctic-dragonfly-1  | I20240109 13:35:23.957948     1 dfly_main.cc:621] mem_path = /sys/fs/cgroup//
arctic-dragonfly-1  | I20240109 13:35:23.957954     1 dfly_main.cc:622] cpu_path = /sys/fs/cgroup//
arctic-dragonfly-1  | F20240109 13:35:23.958010     1 dfly_main.cc:659] Check failed: res.size() == 2u (1 vs. 2)
arctic-dragonfly-1  | *** Check failure stack trace: ***
arctic-dragonfly-1  |     @     0x55c78a2049f3  google::LogMessage::SendToLog()
arctic-dragonfly-1  |     @     0x55c78a1fd1b7  google::LogMessage::Flush()
arctic-dragonfly-1  |     @     0x55c78a1feb3f  google::LogMessageFatal::~LogMessageFatal()
arctic-dragonfly-1  |     @     0x55c78984b50d  _ZZN4dfly12_GLOBAL__N_137UpdateResourceLimitsIfInsideContainerEPN2io11MemInfoDataEPmENKUlSt17basic_string_viewIcSt11char_traitsIcEES4_E0_clES8_S4_.isra.0
arctic-dragonfly-1  |     @     0x55c789827685  main
arctic-dragonfly-1  |     @     0x7f9bb4a91083  __libc_start_main
arctic-dragonfly-1  |     @     0x55c78984980e  _start
arctic-dragonfly-1  |     @              (nil)  (unknown)
arctic-dragonfly-1  | *** SIGABRT received at time=1704807323 on cpu 0 ***
arctic-dragonfly-1  | PC: @     0x7f9bb4ab000b  (unknown)  raise
arctic-dragonfly-1  | [failure_signal_handler.cc : 345] RAW: Signal 11 raised at PC=0x7f9bb4a8f941 while already in AbslFailureSignalHandler()
arctic-dragonfly-1  | *** SIGSEGV received at time=1704807323 on cpu 0 ***
arctic-dragonfly-1  | PC: @     0x7f9bb4a8f941  (unknown)  abort
arctic-dragonfly-1 exited with code 139

This one is basically identical to the previous one...
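
A note for readers following the verbose log: both mem_path and cpu_path come out as /sys/fs/cgroup//. On a cgroup v2 host, /proc/self/cgroup holds a single entry of the form 0::<path>, and inside a container with a private cgroup namespace that path is typically just /. Below is a minimal Python sketch of how such a directory could be derived, assuming the unified hierarchy is mounted at /sys/fs/cgroup. The helper name cgroup_v2_dir is hypothetical and this is not Dragonfly's actual code.

```python
import posixpath


def cgroup_v2_dir(proc_cgroup_text, mount="/sys/fs/cgroup"):
    """Return the cgroup v2 directory for a process, or None if not found.

    proc_cgroup_text is the content of /proc/self/cgroup. Each line has the
    form hierarchy-ID:controller-list:cgroup-path; the cgroup v2 entry uses
    hierarchy ID 0 and an empty controller list.
    (Hypothetical helper for illustration, not taken from Dragonfly.)
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
            # normpath collapses the duplicate slashes that plain string
            # concatenation would leave behind when the path is "/".
            return posixpath.normpath(mount + "/" + parts[2])
    return None
```

With the entry 0::/ this yields /sys/fs/cgroup rather than /sys/fs/cgroup//. The double slash in the log is harmless as a path, so it is probably not the cause of the crash by itself.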

romange commented 10 months ago

@AlphaNecron can you please run docker run -it alpine, then run cat /sys/fs/cgroup/cpu.max inside the container and paste the output here? Thank you.

AlphaNecron commented 10 months ago

It's empty.
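
For context: in cgroup v2, cpu.max normally holds two space-separated fields, a quota and a period (for example "max 100000" when there is no limit). The failed check in the log, res.size() == 2u (1 vs. 2), is consistent with code that splits such a value and expects exactly two fields, though the thread does not confirm which file was being parsed. Below is a minimal Python sketch of a tolerant parser that treats empty or malformed content as "no limit" instead of aborting. The name parse_cpu_max is hypothetical and this is not Dragonfly's actual fix.

```python
def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content into a CPU-count limit.

    Returns quota/period as a float, or None when there is no usable limit
    (quota is "max", or the content is empty or malformed).
    (Hypothetical helper for illustration, not taken from Dragonfly.)
    """
    fields = text.split()
    if len(fields) != 2:
        # Empty or unexpected content: fall back to "no limit".
        return None
    quota, period = fields
    if quota == "max":
        return None
    try:
        quota_us, period_us = int(quota), int(period)
    except ValueError:
        return None
    if quota_us <= 0 or period_us <= 0:
        return None
    return quota_us / period_us
```

With the empty output reported here, parse_cpu_max("") returns None, so a caller could fall back to the host's CPU count.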

AlphaNecron commented 10 months ago

Here's the docker version output, just in case.

Client: Docker Engine - Community
 Version:           24.0.7
 API version:       1.43
 Go version:        go1.20.10
 Git commit:        afdd53b
 Built:             Thu Oct 26 09:08:02 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.7
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.10
  Git commit:       311b9ff
  Built:            Thu Oct 26 09:08:02 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.26
  GitCommit:        3dd1e886e55dd695541fdcd67420c2888645a495
 runc:
  Version:          1.1.10
  GitCommit:        v1.1.10-0-g18a0cb0
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

AlphaNecron commented 10 months ago

Uhh, is there a nightly image for Dragonfly?

chakaz commented 10 months ago

@AlphaNecron - @romange posted this on the wrong issue, so I'm posting here on his behalf:

I just kicked off our weekly build; check it out: https://github.com/dragonflydb/dragonfly/actions/runs/7472965174
Once it finishes, it should update https://github.com/dragonflydb/dragonfly/pkgs/container/dragonfly-weekly

AlphaNecron commented 10 months ago

Thanks a lot! Keep up the great work; it's impressive how fast you guys are tackling issues :D

AlphaNecron commented 10 months ago

It works flawlessly now. It was working fine before an apt upgrade, so there were probably some breaking changes related to Docker or cgroups.
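
If anyone wants to check whether an upgrade changed the cgroup layout on their host, one quick signal is whether /sys/fs/cgroup is a cgroup v2 unified hierarchy, which exposes a cgroup.controllers file at its root, or a v1 layout with per-controller directories. A minimal Python sketch follows; the function name cgroup_version is hypothetical and the v1 check is only a heuristic (hybrid setups are reported as v1).

```python
import os


def cgroup_version(root="/sys/fs/cgroup"):
    """Guess the cgroup version mounted at root: 2, 1, or None if unknown.

    (Hypothetical helper for illustration; the v1 check is a heuristic.)
    """
    # The cgroup v2 unified hierarchy exposes cgroup.controllers at its root.
    if os.path.isfile(os.path.join(root, "cgroup.controllers")):
        return 2
    # cgroup v1 typically mounts one directory per controller.
    if os.path.isdir(os.path.join(root, "memory")) or os.path.isdir(
        os.path.join(root, "cpu")
    ):
        return 1
    return None


if __name__ == "__main__":
    print("cgroup version:", cgroup_version())
```

Running it on the host before and after an upgrade shows whether the cgroup layout changed.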