yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[YSQL][LST] Postgres crash in yb::SharedMemorySegment::GetAddress #13141

Open def- opened 2 years ago

def- commented 2 years ago

Jira Link: DB-2843

Description

$ cd ~/code/yugabyte-db
$ git checkout 226465305d7db859424e06178723b445b6bf964c
$ ./yb_build.sh release
$ bin/yb-ctl --replication_factor 3 create --tserver_flags=yb_enable_read_committed_isolation=true,ysql_enable_packed_row=true,ysql_num_shards_per_tserver=1,enable_stream_compression=true,stream_compression_algo=1,yb_num_shards_per_tserver=1 --master_flags=yb_enable_read_committed_isolation=true,ysql_enable_packed_row=true,enable_stream_compression=true,stream_compression_algo=1,enable_automatic_tablet_splitting=true,tablet_split_low_phase_shard_count_per_node=40,tablet_split_high_phase_shard_count_per_node=50
$ cd ~/code/yb-long-system-test
$ git checkout e78e1e69c8ca04d8019d54703f1fe70ab75ca3e5
$ ./long_system_test.py --nodes=127.0.0.1:5433,127.0.0.2:5433,127.0.0.3:5433 --threads=10 --complexity=full --runtime=60 --max-columns=10 --seed=564357
(gdb) bt
#0  yb::SharedMemorySegment::GetAddress (this=0x0) at ../../../../../src/yb/util/shared_mem.cc:267
#1  0x00007f4a9b5ab9fb in yb::SharedMemoryObject<yb::tserver::TServerSharedData>::get (this=0x0)
    at ../../../../../../src/yb/util/shared_mem.h:88
#2  yb::SharedMemoryObject<yb::tserver::TServerSharedData>::operator* (this=0x0)
    at ../../../../../../src/yb/util/shared_mem.h:96
#3  yb::pggate::PgClient::Impl::Start (this=0x1c70900, proxy_cache=0x1c8d080, scheduler=0x1e12270,
    tserver_shared_object=...) at ../../../../../../src/yb/yql/pggate/pg_client.cc:118
#4  0x00007f4a9b5ab0c1 in yb::pggate::PgClient::Start (this=<optimized out>, proxy_cache=0x1c8d080,
    scheduler=0x1e12270, tserver_shared_object=...)
    at ../../../../../../src/yb/yql/pggate/pg_client.cc:572
#5  0x00007f4a9b59aae1 in yb::pggate::PgApiImpl::PgApiImpl (this=<optimized out>, context=...,
    YBCDataTypeArray=<optimized out>, count=162, callbacks=...)
    at ../../../../../../src/yb/yql/pggate/pggate.cc:434
#6  0x00007f4a9b58f55f in yb::pggate::YBCInitPgGateEx (data_type_table=0x9fd330 <YbTypeEntityTable>,
    count=162, pg_callbacks=..., context=context@entry=0x0)
    at ../../../../../../src/yb/yql/pggate/ybc_pggate.cc:131
#7  0x00007f4a9b58f76c in YBCInitPgGate (data_type_table=0x0, count=29821184, pg_callbacks=...)
    at ../../../../../../src/yb/yql/pggate/ybc_pggate.cc:139
#8  0x00000000009c2b66 in YBInitPostgresBackend (program_name=<optimized out>, db_name=0x316418 "",
    user_name=user_name@entry=0x0)
    at ../../../../../../../src/postgres/src/backend/utils/misc/pg_yb_utils.c:519
#9  0x00000000009a7715 in InitPostgresImpl (in_dbname=0x0, dboid=0, username=0x0, useroid=0,
    out_dbname=0x0, override_allow_connections=<optimized out>,
    yb_sys_table_prefetching_started=<optimized out>)
    at ../../../../../../../src/postgres/src/backend/utils/init/postinit.c:681
#10 InitPostgres (in_dbname=in_dbname@entry=0x0, dboid=dboid@entry=0, username=username@entry=0x0,
    useroid=useroid@entry=0, out_dbname=out_dbname@entry=0x0, override_allow_connections=false)
    at ../../../../../../../src/postgres/src/backend/utils/init/postinit.c:1148
#11 0x00000000005674bc in BootstrapModeMain ()
    at ../../../../../../src/postgres/src/backend/bootstrap/bootstrap.c:509
#12 0x00000000005671fa in AuxiliaryProcessMain (argc=5, argc@entry=6, argv=0x1db8ac8,
    argv@entry=0x1db8ac0) at ../../../../../../src/postgres/src/backend/bootstrap/bootstrap.c:442
#13 0x0000000000712785 in PostgresServerProcessMain (argc=6, argv=0x1db8ac0)
    at ../../../../../../src/postgres/src/backend/main/main.c:226
#14 0x0000000000712ca2 in main ()

I see that there is a similar bug that occurs when a wrong IP address is used, but I have not done that here: https://github.com/yugabyte/yugabyte-db/issues/11390

tedyu commented 2 years ago

I saw the above stack trace more than once when investigating https://github.com/yugabyte/yugabyte-db/issues/13546

In InitTServerSharedObject:

  if (FLAGS_TEST_pggate_ignore_tserver_shm || FLAGS_pggate_tserver_shm_fd == -1) {
    return nullptr;
  }

The default value of FLAGS_pggate_tserver_shm_fd is -1, so by default InitTServerSharedObject returns nullptr. That would match the this=0x0 seen in frames #0 through #2 of the backtrace above.
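
To make the suspected failure mode concrete, here is a minimal standalone sketch. It is not YugabyteDB code: SharedData, InitSharedObject, and StartClient are hypothetical names chosen for illustration, and the real InitTServerSharedObject and PgClient::Impl::Start may differ. It only shows how an fd default of -1 yields a null object, and how a caller could check for that before dereferencing instead of crashing.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the data a tserver shares with postgres.
struct SharedData {
  int value = 42;
};

// Hypothetical stand-in for InitTServerSharedObject. Mirrors the quoted
// condition: no object is created when shared memory is ignored or when
// the fd still has its default value of -1.
std::unique_ptr<SharedData> InitSharedObject(bool ignore_shm, int shm_fd) {
  if (ignore_shm || shm_fd == -1) {
    return nullptr;
  }
  return std::make_unique<SharedData>();
}

// Hypothetical caller. It checks the pointer before dereferencing and
// reports an error instead of reading through a null pointer.
std::string StartClient(const SharedData* shared) {
  if (shared == nullptr) {
    return "error: shared memory object is not initialized";
  }
  return "started with value " + std::to_string(shared->value);
}
```

With the default fd, StartClient(InitSharedObject(false, -1).get()) returns the error string, whereas dereferencing the result unconditionally would be the kind of null access the backtrace shows. Whether an error return or a fatal log message is the right behavior for the bootstrap path is a separate design question.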

def- commented 2 years ago

This one is easy to reproduce and happens all the time.