ydb-platform / ydb

YDB is an open source Distributed SQL Database that combines high availability and scalability with strong consistency and ACID transactions
https://ydb.tech
Apache License 2.0

[pg] Segfault on grafana start #5246

Closed rekby closed 4 months ago

rekby commented 5 months ago
Stacktrace
```
GRpc memory quota was set but disabled due to issues with grpc quoter, to enable it use EnableGRpcMemoryQuota option
AddressSanitizer:DEADLYSIGNAL
=================================================================
==9403==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7f3b367c531e bp 0x9eb1c626b344b985 sp 0xb795c626bc293779 T15)
==9403==The signal is caused by a READ memory access.
==9403==Hint: this fault was caused by a dereference of a high value address (see register values below). Disassemble the provided pc to learn which register was used.
    #0 0x7f3b367c531e setjmp/../sysdeps/x86_64/__longjmp.S:111

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV setjmp/../sysdeps/x86_64/__longjmp.S:111
Thread T15 (ydbd.User) created by T0 here:
warning: address range table at offset 0x0 has a premature terminator entry at offset 0x10
warning: address range table at offset 0x30 has a premature terminator entry at offset 0x40
warning: address range table at offset 0x60 has a premature terminator entry at offset 0x70
warning: address range table at offset 0x90 has a premature terminator entry at offset 0xa0
warning: address range table at offset 0xc0 has a premature terminator entry at offset 0xd0
warning: address range table at offset 0xac0 has a premature terminator entry at offset 0xad0
warning: address range table at offset 0xb20 has a premature terminator entry at offset 0xb30
warning: address range table at offset 0xb50 has a premature terminator entry at offset 0xb60
warning: address range table at offset 0xb80 has a premature terminator entry at offset 0xb90
warning: address range table at offset 0xbb0 has a premature terminator entry at offset 0xbc0
warning: address range table at offset 0xbe0 has a premature terminator entry at offset 0xbf0
warning: address range table at offset 0xc10 has a premature terminator entry at offset 0xc20
warning: address range table at offset 0xc40 has a premature terminator entry at offset 0xc50
warning: address range table at offset 0xc70 has a premature terminator entry at offset 0xc80
warning: address range table at offset 0xca0 has a premature terminator entry at offset 0xcb0
warning: address range table at offset 0xcd0 has a premature terminator entry at offset 0xce0
warning: address range table at offset 0xd00 has a premature terminator entry at offset 0xd10
warning: address range table at offset 0xd30 has a premature terminator entry at offset 0xd40
warning: address range table at offset 0xd60 has a premature terminator entry at offset 0xd70
warning: address range table at offset 0xd90 has a premature terminator entry at offset 0xda0
warning: address range table at offset 0xdc0 has a premature terminator entry at offset 0xdd0
warning: address range table at offset 0xdf0 has a premature terminator entry at offset 0xe00
warning: address range table at offset 0xe20 has a premature terminator entry at offset 0xe30
warning: address range table at offset 0xe50 has a premature terminator entry at offset 0xe60
warning: address range table at offset 0xe80 has a premature terminator entry at offset 0xe90
warning: address range table at offset 0x13c0 has a premature terminator entry at offset 0x13d0
warning: address range table at offset 0x13f0 has a premature terminator entry at offset 0x1400
warning: address range table at offset 0x1420 has a premature terminator entry at offset 0x1430
warning: address range table at offset 0x1450 has a premature terminator entry at offset 0x1460
warning: address range table at offset 0x1570 has a premature terminator entry at offset 0x1580
warning: address range table at offset 0x15a0 has a premature terminator entry at offset 0x15b0
warning: address range table at offset 0x15d0 has a premature terminator entry at offset 0x15e0
warning: address range table at offset 0x1600 has a premature terminator entry at offset 0x1610
warning: address range table at offset 0x1630 has a premature terminator entry at offset 0x1640
warning: address range table at offset 0x1660 has a premature terminator entry at offset 0x1670
    #0 0x2008422c in __interceptor_pthread_create /home/rekby/ydbwork/ydb/contrib/libs/clang16-rt/lib/asan/asan_interceptors.cpp:208:3
    #1 0x202bc533 in (anonymous namespace)::TPosixThread::Start() /home/rekby/ydbwork/ydb/util/system/thread.cpp:229:27
    #2 0x202bc533 in TThread::Start() /home/rekby/ydbwork/ydb/util/system/thread.cpp:314:34
    #3 0x21e7fc83 in NActors::TBasicExecutorPool::Start() /home/rekby/ydbwork/ydb/ydb/library/actors/core/executor_pool_basic.cpp:466:32
    #4 0x21e661b5 in NActors::TCpuManager::Start() /home/rekby/ydbwork/ydb/ydb/library/actors/core/cpu_manager.cpp:81:32
    #5 0x21e5c4e4 in NActors::TActorSystem::Start() /home/rekby/ydbwork/ydb/ydb/library/actors/core/actorsystem.cpp:300:21
    #6 0x33e75e2e in NKikimr::TKikimrRunner::KikimrStart() /home/rekby/ydbwork/ydb/ydb/core/driver_lib/run/run.cpp:1669:22
    #7 0x334d1a30 in NKikimr::MainRun(NKikimr::TKikimrRunConfig const&, std::__y1::shared_ptr) /home/rekby/ydbwork/ydb/ydb/core/driver_lib/run/main.cpp:46:17
    #8 0x33630dd1 in NKikimr::NDriverClient::TClientCommandServer::Run(NYdb::NConsoleClient::TClientCommand::TConfig&) /home/rekby/ydbwork/ydb/ydb/core/driver_lib/cli_utils/cli_cmds_server.cpp:51:12
    #9 0x3352ca1e in NYdb::NConsoleClient::TClientCommandTree::Run(NYdb::NConsoleClient::TClientCommand::TConfig&) /home/rekby/ydbwork/ydb/ydb/public/lib/ydb_cli/common/command.cpp:374:33
    #10 0x335200d0 in NKikimr::NDriverClient::NewClient(int, char**, std::__y1::shared_ptr) /home/rekby/ydbwork/ydb/ydb/core/driver_lib/cli_utils/cli_cmds_root.cpp:82:26
    #11 0x334d3082 in NKikimr::Main(int, char**, std::__y1::shared_ptr) /home/rekby/ydbwork/ydb/ydb/core/driver_lib/run/main.cpp:148:20
    #12 0x334d47cb in ParameterizedMain(int, char**, std::__y1::shared_ptr) /home/rekby/ydbwork/ydb/ydb/core/driver_lib/run/main.cpp:198:16
    #13 0x2002930d in main /home/rekby/ydbwork/ydb/ydb/apps/ydbd/main.cpp:24:12
    #14 0x7f3b367acd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
==9403==ABORTING
```

To reproduce with Grafana:

  1. Run a local YDB as described in https://github.com/ydb-platform/ydb/wiki/Local-run-postgres-tests, after running export ASAN_SYMBOLIZER_PATH=$(ya tool llvm-symbolizer --print-path)
  2. Clone https://github.com/ydb-platform/postgres-compatibility-tests (commit 3c21c218f087c942908d1d62d06edfc3e5c1f044)
  3. From the root of that repository, run:
    cat ./manual-in-progress/grafana/ydb-schema.sql | psql postgres://root:1234@localhost:5432/local
    cat ./manual-in-progress/grafana/ydb-data.sql | psql postgres://root:1234@localhost:5432/local
    docker-compose -f ./manual-in-progress/grafana/docker-compose.yaml down -vt 1 && docker-compose -f ./manual-in-progress/grafana/docker-compose.yaml up
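
The commands in step 3 can be collected into a small script. This is only a sketch under stated assumptions: the DRY_RUN switch and the DB_URL and GRAFANA_DIR variables are hypothetical names introduced here, and psql -f is used in place of cat ... | psql, which should load the same files. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this issue (not part of the repository).
# DRY_RUN, DB_URL and GRAFANA_DIR are hypothetical names used for illustration.
set -eu

DB_URL="${DB_URL:-postgres://root:1234@localhost:5432/local}"
GRAFANA_DIR="${GRAFANA_DIR:-./manual-in-progress/grafana}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# load_sql: feed one SQL file to YDB through its PostgreSQL endpoint.
load_sql() {
    run psql "$DB_URL" -f "$1"
}

# reproduce: load the schema and data, then restart Grafana from scratch.
reproduce() {
    load_sql "$GRAFANA_DIR/ydb-schema.sql"
    load_sql "$GRAFANA_DIR/ydb-data.sql"
    run docker-compose -f "$GRAFANA_DIR/docker-compose.yaml" down -vt 1
    run docker-compose -f "$GRAFANA_DIR/docker-compose.yaml" up
}

reproduce
```

Run it from the root of the postgres-compatibility-tests checkout so that the relative paths resolve.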

After the docker-compose command, ydbd crashes, and the core dump contains no backtrace.

I can get the stack trace with the following steps (flaky):

  1. Run ydbd with ASAN_SYMBOLIZER_PATH set
  2. Attach gdb to the process: sudo ./ya tool gdb ~/ydbwork/ydb/ydb/apps/ydbd/ydbd --pid YDBD_PID, then execute continue
  3. Run docker-compose -f ./manual-in-progress/grafana/docker-compose.yaml down -vt 1 && docker-compose -f ./manual-in-progress/grafana/docker-compose.yaml up so that ydbd fails
  4. Exit gdb, detaching from the process
  5. Restart ydbd through ydb_local

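
Step 2 could be scripted roughly as below. This is a hedged sketch: the gdb_attach_cmd helper and the YDBD_BIN variable are hypothetical, the process is assumed to be named exactly ydbd, and it is an assumption that ya tool gdb forwards gdb's -ex option the same way it forwards --pid. The script only prints the attach command; it does not run gdb.

```shell
#!/bin/sh
# Sketch only: prints the gdb attach command for a running ydbd.
# YDBD_BIN and gdb_attach_cmd are hypothetical names used for illustration.
set -eu

YDBD_BIN="${YDBD_BIN:-$HOME/ydbwork/ydb/ydb/apps/ydbd/ydbd}"

# gdb_attach_cmd: print the attach command for the given PID.
# "-ex continue" asks gdb to resume the process right after attaching
# (assumes "ya tool gdb" passes this option through to gdb).
gdb_attach_cmd() {
    printf 'sudo ./ya tool gdb %s --pid %s -ex continue\n' "$YDBD_BIN" "$1"
}

# Look up the newest process named exactly "ydbd", if there is one.
if pid=$(pgrep -n -x ydbd 2>/dev/null); then
    gdb_attach_cmd "$pid"
else
    echo "no running ydbd process found" >&2
fi
```
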
rekby commented 4 months ago

can't reproduce