Open shamanthchandra-yb opened 1 year ago
Observed a similar error in the Packed YCQL tests (which don't have any YSQL workload) as well:
(lldb) target create "/home/yugabyte/yb-software/yugabyte-2.18.5.0-b70-centos-x86_64/postgres/bin/postgres" --core "/home/yugabyte/cores/core_248834_1700740904_!home!yugabyte!yb-software!yugabyte-2.18.5.0-b70-centos-x86_64!postgres!bin!postgres"
Core file '/home/yugabyte/cores/core_248834_1700740904_!home!yugabyte!yb-software!yugabyte-2.18.5.0-b70-centos-x86_64!postgres!bin!postgres' (x86_64) was loaded.
(lldb) bt all
warning: This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited.
* thread #1, name = 'postgres', stop reason = signal SIGABRT
* frame #0: 0x00007f7647d380a7 libc.so.6`__GI_raise(sig=6) at raise.c:54
frame #1: 0x00007f7647d394aa libc.so.6`__GI_abort at abort.c:89
frame #2: 0x00007f7643cab88a libglog.so.0`google::LogMessage::Flush() + 346
frame #3: 0x00007f7643cab632 libglog.so.0`google::LogMessage::~LogMessage() + 18
frame #4: 0x00007f7645764e89 libserver_process.so`yb::Webserver::Impl::LogMessageCallbackStatic(connection=<unavailable>, message="Failed to enter worker thread") at webserver.cc:523:5
frame #5: 0x00007f764576c5b8 libserver_process.so`cry + 200
frame #6: 0x00007f764577197e libserver_process.so`worker_thread + 94
frame #7: 0x00007f76486ae694 libpthread.so.0`start_thread(arg=0x00007f7638856700) at pthread_create.c:333
frame #8: 0x00007f7647deb41d libc.so.6`__clone at clone.S:109
thread #2, stop reason = signal 0
frame #0: 0x00007f7643237e10 pg_hint_plan.so`__do_fini
frame #1: 0x00007f7649c4332a ld.so`_dl_fini at dl-fini.c:235
frame #2: 0x00007f7647d3a969 libc.so.6`__run_exit_handlers(status=1, listp=0x00007f764809b5c0, run_list_atexit=true) at exit.c:82
frame #3: 0x00007f7647d3a9b5 libc.so.6`__GI_exit(status=<unavailable>) at exit.c:104
frame #4: 0x0000558674adba62 postgres`proc_exit(code=1) at ipc.c:157:2
frame #5: 0x0000558674cbe8af postgres`errfinish(dummy=<unavailable>) at elog.c:801:3
frame #6: 0x0000558674a3b91d postgres`bgworker_die(postgres_signal_arg=<unavailable>) at bgworker.c:672:2
frame #7: 0x00007f76486b6ba0 libpthread.so.0`__restore_rt
frame #8: 0x00007f7647deb9f3 libc.so.6`epoll_wait at syscall-template.S:84
frame #9: 0x0000558674adda85 postgres`WaitEventSetWait [inlined] WaitEventSetWaitBlock(set=0x000055867870de88, cur_timeout=-1, occurred_events=0x00007ffe3605a0a8, nevents=1) at latch.c:1062:7
frame #10: 0x0000558674adda77 postgres`WaitEventSetWait(set=0x000055867870de88, timeout=-1, occurred_events=<unavailable>, nevents=1, wait_event_info=<unavailable>) at latch.c:1014:8
frame #11: 0x0000558674add503 postgres`WaitLatchOrSocket(latch=0x00007f76423e788c, wakeEvents=<unavailable>, sock=-1, timeout=<unavailable>, wait_event_info=117440512) at latch.c:399:7
frame #12: 0x0000558674add400 postgres`WaitLatch(latch=<unavailable>, wakeEvents=<unavailable>, timeout=<unavailable>, wait_event_info=<unavailable>) at latch.c:353:9 [artificial]
frame #13: 0x00007f76432595d0 yb_pg_metrics.so`webserver_worker_main(unused=<unavailable>) at yb_pg_metrics.c:376:3
frame #14: 0x0000558674a3b7ac postgres`StartBackgroundWorker at bgworker.c:841:2
frame #15: 0x0000558674a5411c postgres`maybe_start_bgworkers [inlined] do_start_bgworker(rw=0x000055867873e000) at postmaster.c:6033:4
frame #16: 0x0000558674a540bf postgres`maybe_start_bgworkers at postmaster.c:6259:9
frame #17: 0x0000558674a50980 postgres`PostmasterMain(argc=25, argv=0x00005586786d6000) at postmaster.c:1429:2
frame #18: 0x000055867495623f postgres`PostgresServerProcessMain(argc=25, argv=0x00005586786d6000) at main.c:234:3
frame #19: 0x000055867461f062 postgres`main + 34
frame #20: 0x00007f7647d25825 libc.so.6`__libc_start_main(main=(postgres`main), argc=25, argv=0x00007ffe3605a9f8, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffe3605a9e8) at libc-start.c:289
frame #21: 0x000055867461ef79 postgres`_start at start.S:108
thread #3, stop reason = signal 0
frame #0: 0x00007f7647de25cd libc.so.6`poll at syscall-template.S:84
frame #1: 0x00007f76457715ac libserver_process.so`master_thread + 572
frame #2: 0x00007f76486ae694 libpthread.so.0`start_thread(arg=0x00007f7639057700) at pthread_create.c:333
frame #3: 0x00007f7647deb41d libc.so.6`__clone at clone.S:109
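For triage, the per-thread frames in a `bt all` dump like the one above can be pulled out with a small script. This is only a minimal sketch and not part of any YugabyteDB tooling: the function name `summarize_backtrace` and both regexes are assumptions based on the lldb output format shown in this comment, and other lldb versions may format frames differently.

```python
import re

# Assumed line shapes, taken from the lldb output in this issue:
#   "* thread #1, name = 'postgres', stop reason = signal SIGABRT"
#   "frame #4: 0x00007f76... libserver_process.so`yb::Webserver::...(args) at file.cc:523:5"
THREAD_RE = re.compile(r"^\*?\s*thread #(\d+)")
FRAME_RE = re.compile(r"^\*?\s*frame #(\d+): 0x[0-9a-f]+ (\S+?)`(.+)$")


def summarize_backtrace(text):
    """Return {thread_number: [(module, symbol), ...]} for an lldb `bt all` dump."""
    threads = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        thread_match = THREAD_RE.match(line)
        if thread_match:
            current = int(thread_match.group(1))
            threads[current] = []
            continue
        frame_match = FRAME_RE.match(line)
        if frame_match and current is not None:
            module = frame_match.group(2)
            # Keep only the symbol name: cut at the first "(" or whitespace,
            # which drops arguments, "+ offset" suffixes and source locations.
            symbol = re.split(r"[(\s]", frame_match.group(3), maxsplit=1)[0]
            threads[current].append((module, symbol))
    return threads


if __name__ == "__main__":
    sample = (
        "* thread #1, name = 'postgres', stop reason = signal SIGABRT\n"
        "* frame #0: 0x00007f7647d380a7 libc.so.6`__GI_raise(sig=6) at raise.c:54\n"
        "frame #2: 0x00007f7643cab88a libglog.so.0`google::LogMessage::Flush() + 346\n"
    )
    for thread_id, frames in summarize_backtrace(sample).items():
        print(thread_id, frames)
```

Run against the full dump, this makes it easier to compare which thread is aborting and which thread is inside `proc_exit` across different core files.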
Stress test link: http://stress.dev.yugabyte.com/stress_test/09501ea0-5786-4e0c-b996-7e3dbae0d5b6
Version: 2.18.5.0-b70
cc: @SergeyPotachev @rthallamko3 @renjith-yb @kripasreenivasan @shamanthchandra-yb
Observed again. The run was on 2.20.2.0-b109, in test_cdc_with_consistency_bank_tx_before_image.
The run link is in the JIRA comments: https://yugabyte.atlassian.net/browse/DB-8108?focusedCommentId=95970
@m-iancu last week we discussed that a few fixes had gone in around the webserver recently, and you had asked for runs where this issue is hit. Here is the latest occurrence: https://yugabyte.atlassian.net/browse/DB-8108?focusedCommentId=97607
Jira Link: DB-8108
Description
This is a stress case that runs with nemesis, so the issue could have been hit in any one of the random combinations.
PFA the stress report.
The last sample app I was running was:
java -jar /tmp/tests/artifacts/stress-sample-app-tool/yb-stress-sample-apps-1.1.18.jar --workload SqlDataLoad --default_postgres_database cdc_6b7aa5 --num_writes 334000 --num_threads_write 26 --num_threads_read 0 --num_reads 0 --num_unique_keys 100000000000 --batch_size 195 --num_value_columns 17 --create_table_name test_cdc_3c1703 --skip_ddl --uuid_column --uuid 3eecdbb6-317c-4dbe-9b32-fae687fdfcd3 --large_key_multiplier 3 --large_value_multiplier 3 --uuid_marker 41a4db05-c5c0-4838-a0d6-e2730f1283ec --nodes 172.151.26.173:5433,172.151.21.231:5433,172.151.17.255:5433