yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[DocDB] The namespace is in process of deletion due to internal error while creating more than 400 databases #19903

Open pilshchikov opened 1 year ago

pilshchikov commented 1 year ago

Jira Link: DB-8849

Description

Case:

  1. Use a 3-node cluster, RF=3, c5.4xlarge.
  2. Create databases with 10 tables each, and insert 10000 rows into one table using generate_series.
  3. After every 60 tables, run a manual compaction on those 60 tables.
  4. At the 410th database, all subsequent queries fail with this error (Java client):
    com.yugabyte.util.PSQLException: ERROR: Namespace Create Failed: The namespace is in process of deletion due to internal error.
    • All subsequent CREATE DATABASE queries throw the same error.
    • Version: 2.20.1.0-b17
    • The same error occurs on 2.16/2.18/2.21.
    • The case reproduces even if we create only one table per database.
    • On 2.12.8.0-b5 we were able to create 478 databases; it then failed only because the create query reached the 2-minute default timeout.
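
The reproduction steps above could be scripted roughly as follows. This is only a sketch that generates the SQL text: the names `db_N` and `t_N`, the table schema, and the helper functions are hypothetical and not taken from the original test. The manual compaction step is left out because the report does not say which command was used.

```python
from typing import Iterator, List, Tuple


def statements_for_database(
    db_index: int, tables_per_db: int = 10, rows: int = 10000
) -> Tuple[str, List[str]]:
    """Return (CREATE DATABASE statement, statements to run inside that database).

    Names and schema are illustrative assumptions, not the reporter's actual test.
    """
    create_db = f"CREATE DATABASE db_{db_index};"
    in_db = [
        f"CREATE TABLE t_{t} (id int PRIMARY KEY, v int);"
        for t in range(tables_per_db)
    ]
    # Populate one table with generated rows, as described in step 2.
    in_db.append(
        f"INSERT INTO t_0 SELECT i, i FROM generate_series(1, {rows}) AS i;"
    )
    return create_db, in_db


def workload(num_databases: int = 410) -> Iterator[Tuple[str, List[str]]]:
    """Yield the statements for each database in creation order."""
    for n in range(1, num_databases + 1):
        yield statements_for_database(n)


if __name__ == "__main__":
    create_db, in_db = statements_for_database(1)
    print(create_db)
    for stmt in in_db:
        print(stmt)
```

Each `CREATE DATABASE` has to be issued from an existing connection, and the remaining statements from a new connection to the freshly created database, which is why the two are returned separately.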

All logs are in the first Jira comment.

Issue Type

kind/bug

lingamsandeep commented 7 months ago

The cause for this is a memory allocation failure with this stack:

tcmalloc::allocate_full_malloc_oom()
_malloc_zone_malloc_instrumented_or_legacy
yb::malloc_with_check()
yb::RefCntBuffer::RefCntBuffer()
yb::RefCntBuffer::RefCntBuffer()
yb::rpc::CallData::CallData()
yb::rpc::CallData::CallData()
yb::rpc::BinaryCallParser::Parse()
yb::rpc::YBInboundConnectionContext::ProcessCalls()
yb::rpc::Connection::ProcessReceived()
yb::rpc::RefinedStream::ProcessReceived()
yb::rpc::TcpStream::TryProcessReceived()
yb::rpc::TcpStream::ReadHandler()
yb::rpc::TcpStream::Handler()
ev::base<>::method_thunk<>()
ev_invoke_pending
ev_run
ev::loop_ref::run()
yb::rpc::Reactor::RunThread()

During CreateDatabase, as part of CopyPgsqlSysTables/CreateYsqlSysTable, we mutate the syscatalog tablet to add the tableId for every pg catalog table. There are 108 pg tables, so this results in 108 kv updates to the syscatalog tablet. These could have been batched into a single kv update that includes all the tableIds in one shot. The WRITE_OP generated as a result is also large, so the followers need a large memory allocation when they receive it, and that allocation can fail. On top of this, the syscatalog maintains a single KV with all the table_ids, so every new database creation increases the size of the value entry for this KV. Eventually, with enough databases in the system, the next create produces a WRITE_OP that triggers a memory allocation large enough to fail on the followers, causing the create database to fail.
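
To make the growth concrete, here is a rough back-of-the-envelope model of the payload size. It is a sketch, not a measurement. Only the 108 catalog tables per database comes from the analysis above. The 32-byte table id, the assumption that each of the 108 updates carries the whole value, and the function names are illustrative, and the model ignores every other field in the real entry.

```python
PG_CATALOG_TABLES = 108  # per database, from the analysis above
TABLE_ID_BYTES = 32      # assumption: size of one table id in the value


def value_size(num_table_ids: int, id_bytes: int = TABLE_ID_BYTES) -> int:
    """Size of the single KV value that lists all table ids."""
    return num_table_ids * id_bytes


def write_op_bytes_unbatched(db_number: int) -> int:
    """Payload when creating database number `db_number` (1-based), assuming
    108 separate updates that each carry the full, growing value."""
    existing = (db_number - 1) * PG_CATALOG_TABLES
    return sum(
        value_size(existing + k) for k in range(1, PG_CATALOG_TABLES + 1)
    )


def write_op_bytes_batched(db_number: int) -> int:
    """Payload if all 108 table ids were added in one update."""
    return value_size(db_number * PG_CATALOG_TABLES)


if __name__ == "__main__":
    for n in (1, 100, 410):
        print(n, write_op_bytes_unbatched(n), write_op_bytes_batched(n))
```

Under these assumptions both payloads grow linearly with the number of databases, and the unbatched one is roughly 108 times larger. That matches the point above that batching would help, but the single ever-growing KV remains a problem either way.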

lingamsandeep commented 7 months ago

22046 is the same issue as this one.