yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[YCQL Packed Rows] [Stress Tests] Core dump with SIGSEGV on libyb_util.so`yb::PrometheusWriter::WriteSingleEntry - handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count after Tserver Restart Nemesis #19456

Open zlareb1-yb opened 1 year ago

zlareb1-yb commented 1 year ago

Jira Link: DB-8258

Description

Steps: [image]

Core dump Logs:

(lldb) target create "/home/yugabyte/yb-software/yugabyte-2.18.4.0-b37-centos-x86_64/postgres/bin/postgres" --core "/home/yugabyte/cores/core_421248_1696572712_!home!yugabyte!yb-software!yugabyte-2.18.4.0-b37-centos-x86_64!postgres!bin!postgres"
Core file '/home/yugabyte/cores/core_421248_1696572712_!home!yugabyte!yb-software!yugabyte-2.18.4.0-b37-centos-x86_64!postgres!bin!postgres' (x86_64) was loaded.
(lldb) bt all
warning: This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited.
* thread #1, name = 'postgres', stop reason = signal SIGSEGV
  * frame #0: 0x00007f46f19fef35 libyb_util.so`std::__1::__hash_const_iterator<std::__1::__hash_node<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, void*>*> std::__1::__hash_table<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::__unordered_map_hasher<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, true>, std::__1::__unordered_map_equal<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, true>, std::__1::allocator<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>>::find<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>(this=<unavailable>, __k=<unavailable>) const at __hash_table:2338:31
    frame #1: 0x00007f46f19fea66 libyb_util.so`yb::PrometheusWriter::WriteSingleEntry(std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, long, yb::AggregationFunction, char const*, char const*) [inlined] std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>>::find[abi:v15007](this=0x00007f46f320b860, __k="table_id") const at unordered_map:1445:69
    frame #2: 0x00007f46f19fea50 libyb_util.so`yb::PrometheusWriter::WriteSingleEntry(this=0x00007f46e62df7d8, attr=0x00007f46f320b860, name="handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count", value=17, aggregation_function=kSum, type="", description="") at metrics_writer.cc:125:28
    frame #3: 0x00007f46f31cd5aa libserver_process.so`yb::PgPrometheusMetricsHandler(req=<unavailable>, resp=<unavailable>) at pgsql_webserver_wrapper.cc:324:5
    frame #4: 0x00007f46f31f2640 libserver_process.so`yb::Webserver::Impl::RunPathHandler(yb::Webserver::Impl::PathHandler const&, sq_connection*, sq_request_info*) [inlined] std::__1::__function::__value_func<void (yb::WebCallbackRegistry::WebRequest const&, yb::WebCallbackRegistry::WebResponse*)>::operator(this=0x00005652da42d7d0, __args=0x00007f46e62e1b40, __args=0x00007f46e62e1a10)[abi:v15007](yb::WebCallbackRegistry::WebRequest const&, yb::WebCallbackRegistry::WebResponse*&&) const at function.h:512:16
    frame #5: 0x00007f46f31f2627 libserver_process.so`yb::Webserver::Impl::RunPathHandler(yb::Webserver::Impl::PathHandler const&, sq_connection*, sq_request_info*) [inlined] std::__1::function<void (yb::WebCallbackRegistry::WebRequest const&, yb::WebCallbackRegistry::WebResponse*)>::operator(this= Function = yb::PgPrometheusMetricsHandler(yb::WebCallbackRegistry::WebRequest const&, yb::WebCallbackRegistry::WebResponse*) , __arg=0x00007f46e62e1b40, __arg=0x00007f46e62dfa10)(yb::WebCallbackRegistry::WebRequest const&, yb::WebCallbackRegistry::WebResponse*) const at function.h:1197:12
    frame #6: 0x00007f46f31f2627 libserver_process.so`yb::Webserver::Impl::RunPathHandler(this=0x00005652da454280, handler=<unavailable>, connection=0x00005652da4ba000, request_info=<unavailable>) at webserver.cc:625:5
    frame #7: 0x00007f46f31f20bd libserver_process.so`yb::Webserver::Impl::BeginRequestCallback(this=0x00005652da454280, connection=0x00005652da4ba000, request_info=0x00005652da4ba000) at webserver.cc:557:10
    frame #8: 0x00007f46f31fec78 libserver_process.so`process_new_connection + 4184
    frame #9: 0x00007f46f31fd928 libserver_process.so`worker_thread + 232
    frame #10: 0x00007f46f612b694 libpthread.so.0`start_thread(arg=0x00007f46e62f1700) at pthread_create.c:333
    frame #11: 0x00007f46f586841d libc.so.6`__clone at clone.S:109
  thread #2, stop reason = signal 0
    frame #0: 0x00007f46f585f5cd libc.so.6`poll at syscall-template.S:84
    frame #1: 0x00007f46f31fd4cc libserver_process.so`master_thread + 572
    frame #2: 0x00007f46f612b694 libpthread.so.0`start_thread(arg=0x00007f46e6af2700) at pthread_create.c:333
    frame #3: 0x00007f46f586841d libc.so.6`__clone at clone.S:109
  thread #3, stop reason = signal 0
    frame #0: 0x00007f46f2365920 libyb_bfql.so`std::__1::__shared_ptr_emplace<yb::bfql::OPERATOR_ListFrozen_158, std::__1::allocator<yb::bfql::OPERATOR_ListFrozen_158>>::__on_zero_shared(this=0x00005652da39f860) at shared_ptr.h:305
    frame #1: 0x00007f46f236307e libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007]() [inlined] std::__1::__shared_count::__release_shared[abi:v15007](this=0x00005652da39f860) at shared_ptr.h:174:9
    frame #2: 0x00007f46f2363063 libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007]() [inlined] std::__1::__shared_weak_count::__release_shared[abi:v15007](this=0x00005652da39f860) at shared_ptr.h:215:27
    frame #3: 0x00007f46f2363063 libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007]() [inlined] std::__1::shared_ptr<yb::bfql::BFOperator>::~shared_ptr[abi:v15007](this=0x00005652da3a09e0) at shared_ptr.h:702:23
    frame #4: 0x00007f46f236305e libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007]() [inlined] void std::__1::__destroy_at[abi:v15007]<std::__1::shared_ptr<yb::bfql::BFOperator>, 0>(__loc=0x00005652da3a09e0) at construct_at.h:63:13
    frame #5: 0x00007f46f236305e libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007]() [inlined] void std::__1::destroy_at[abi:v15007]<std::__1::shared_ptr<yb::bfql::BFOperator>, 0>(__loc=0x00005652da3a09e0) at construct_at.h:88:5
    frame #6: 0x00007f46f236305e libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007]() [inlined] void std::__1::allocator_traits<std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::destroy[abi:v15007]<std::__1::shared_ptr<yb::bfql::BFOperator>, void, void>((null)=0x00007f46f2387160, __p=0x00005652da3a09e0) at allocator_traits.h:317:9
    frame #7: 0x00007f46f236305e libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007]() [inlined] std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::__base_destruct_at_end[abi:v15007](this=0x00007f46f2387150, __new_last=0x00005652da3a0000) at vector:843:9
    frame #8: 0x00007f46f2363036 libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007]() [inlined] std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::__clear[abi:v15007](this=0x00007f46f2387150) at vector:837:29
    frame #9: 0x00007f46f2363036 libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007]() [inlined] std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::__destroy_vector::operator(this=<unavailable>)[abi:v15007]() at vector:439:20
    frame #10: 0x00007f46f2363024 libyb_bfql.so`std::__1::vector<std::__1::shared_ptr<yb::bfql::BFOperator>, std::__1::allocator<std::__1::shared_ptr<yb::bfql::BFOperator>>>::~vector[abi:v15007](this=0x00007f46f2387150) at vector:449:67
    frame #11: 0x00007f46f57b7ccf libc.so.6`__cxa_finalize(d=0x00007f46f2386b70) at cxa_finalize.c:56
    frame #12: 0x00007f46f235441e libyb_bfql.so`__do_fini + 62
    frame #13: 0x00007f46f76bf32a ld.so`_dl_fini at dl-fini.c:235
    frame #14: 0x00007f46f57b7969 libc.so.6`__run_exit_handlers(status=1, listp=0x00007f46f5b185c0, run_list_atexit=true) at exit.c:82
    frame #15: 0x00007f46f57b79b5 libc.so.6`__GI_exit(status=<unavailable>) at exit.c:104
    frame #16: 0x00005652d6c7b1f2 postgres`proc_exit(code=1) at ipc.c:157:2
    frame #17: 0x00005652d6e5e4cf postgres`errfinish(dummy=<unavailable>) at elog.c:801:3
    frame #18: 0x00005652d6bdb1cd postgres`bgworker_die(postgres_signal_arg=<unavailable>) at bgworker.c:672:2
    frame #19: 0x00007f46f6133ba0 libpthread.so.0`__restore_rt
    frame #20: 0x00007f46f58689f3 libc.so.6`epoll_wait at syscall-template.S:84
    frame #21: 0x00005652d6c7d215 postgres`WaitEventSetWait [inlined] WaitEventSetWaitBlock(set=0x00005652da423e88, cur_timeout=-1, occurred_events=0x00007ffc4e7e5e88, nevents=1) at latch.c:1062:7
    frame #22: 0x00005652d6c7d207 postgres`WaitEventSetWait(set=0x00005652da423e88, timeout=-1, occurred_events=<unavailable>, nevents=1, wait_event_info=<unavailable>) at latch.c:1014:8
    frame #23: 0x00005652d6c7cc93 postgres`WaitLatchOrSocket(latch=0x00007f46efe8288c, wakeEvents=<unavailable>, sock=-1, timeout=<unavailable>, wait_event_info=117440512) at latch.c:399:7
    frame #24: 0x00005652d6c7cb90 postgres`WaitLatch(latch=<unavailable>, wakeEvents=<unavailable>, timeout=<unavailable>, wait_event_info=<unavailable>) at latch.c:353:9 [artificial]
    frame #25: 0x00007f46f0cf5640 yb_pg_metrics.so`webserver_worker_main(unused=<unavailable>) at yb_pg_metrics.c:376:3
    frame #26: 0x00005652d6bdb05c postgres`StartBackgroundWorker at bgworker.c:841:2
    frame #27: 0x00005652d6bf3e3c postgres`maybe_start_bgworkers [inlined] do_start_bgworker(rw=0x00005652da454000) at postmaster.c:6005:4
    frame #28: 0x00005652d6bf3ddf postgres`maybe_start_bgworkers at postmaster.c:6231:9
    frame #29: 0x00005652d6bf05e0 postgres`PostmasterMain(argc=25, argv=0x00005652da3ec000) at postmaster.c:1429:2
    frame #30: 0x00005652d6af5a4f postgres`PostgresServerProcessMain(argc=25, argv=0x00005652da3ec000) at main.c:234:3
    frame #31: 0x00005652d67bd502 postgres`main + 34
    frame #32: 0x00007f46f57a2825 libc.so.6`__libc_start_main(main=(postgres`main), argc=25, argv=0x00007ffc4e7e67d8, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffc4e7e67c8) at libc-start.c:289
    frame #33: 0x00005652d67bd419 postgres`_start at start.S:108
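
One possible reading of the backtrace above, not confirmed in this issue: thread #1 is still serving the Prometheus endpoint and looking up "table_id" in an attribute map, while thread #3 has already entered exit() from bgworker_die and is running library finalizers. If the map were destroyed during that teardown, the lookup would touch freed memory. The sketch below only illustrates the general ordering that avoids this class of race: stop and join the reader before the data it reads is destroyed. `MetricsWorker` and `RunAndShutDownCleanly` are hypothetical names introduced for illustration; this is not YugabyteDB code and not the eventual fix.

```cpp
#include <atomic>
#include <string>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for a metrics endpoint thread: it repeatedly looks up
// a key in a shared attribute map, similar in shape to the find("table_id")
// call visible in frame #1 of the crashing thread.
class MetricsWorker {
 public:
  explicit MetricsWorker(
      const std::unordered_map<std::string, std::string>& attrs)
      : attrs_(attrs), thread_([this] { Run(); }) {}

  // Joining in the destructor guarantees the worker never outlives this object.
  ~MetricsWorker() { Stop(); }

  // Signal the worker and wait for it to finish. After this returns, the
  // worker no longer reads attrs_, so the map can be destroyed safely.
  void Stop() {
    stop_.store(true, std::memory_order_release);
    if (thread_.joinable()) thread_.join();
  }

  long lookups() const { return lookups_.load(std::memory_order_acquire); }

 private:
  void Run() {
    while (!stop_.load(std::memory_order_acquire)) {
      if (attrs_.find("table_id") != attrs_.end()) {
        lookups_.fetch_add(1, std::memory_order_release);
      }
      std::this_thread::yield();
    }
  }

  const std::unordered_map<std::string, std::string>& attrs_;
  std::atomic<bool> stop_{false};
  std::atomic<long> lookups_{0};
  std::thread thread_;  // declared last: all other members exist before Run()
};

// Runs a worker against a map, stops it, and only then lets the map go out
// of scope. Returns how many successful lookups the worker performed.
long RunAndShutDownCleanly() {
  std::unordered_map<std::string, std::string> attrs{{"table_id", "t1"}};
  MetricsWorker worker(attrs);
  while (worker.lookups() == 0) std::this_thread::yield();
  worker.Stop();  // reader is joined before attrs is destroyed
  return worker.lookups();
}
```

Calling `RunAndShutDownCleanly()` returns a count of at least one lookup. Skipping the join and destroying the map while the worker is still running would be undefined behavior and could surface as a SIGSEGV inside the hash table lookup.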

Stress report - http://stress.dev.yugabyte.com/stress_test/81d7f68d-f6bc-4edc-bc95-ad77dcc5ad0d

Version - 2.18.4.0-b37

Time - Test was running for 12 hours

Last passed build - 2.18.4.0-b20

Workload - Only the YCQL workload was run in this test

Possibly a similar issue - https://github.com/yugabyte/yugabyte-db/issues/19447

cc: @rthallamko3 @spolitov @renjith-yb @kripasreenivasan

Issue Type

kind/bug


zlareb1-yb commented 1 year ago

Another similar issue - https://github.com/yugabyte/yugabyte-db/issues/19046

sushantrmishra commented 1 year ago

Possible duplicate of https://github.com/yugabyte/yugabyte-db/issues/18948. We should re-verify once the fix for https://github.com/yugabyte/yugabyte-db/issues/18948 has landed.