ydb-platform / ydb

YDB is an open source Distributed SQL Database that combines high availability and scalability with strong consistency and ACID transactions
https://ydb.tech
Apache License 2.0

Maybe deadlock in c++ ydb sdk #2944

Open kardymonds opened 7 months ago

kardymonds commented 7 months ago

This is a copy of the ticket YDBREQUESTS-1561. The text and comments below are copied verbatim from that ticket.

I use the C++ SDK in async style:

        static constexpr auto query = R"(
            DECLARE $shard_key AS Uint64;
            DECLARE $yandexid AS String;

            SELECT device_id
                FROM device_subscriptions
            WHERE shard_key = $shard_key
                AND yandexid = $yandexid
                AND subscribed = false;
        )";

        auto params = session.GetParamsBuilder()
            .AddParam("$shard_key")
                .Uint64(LegacyComputeShardKey(puid))
                .Build()
            .AddParam("$yandexid")
                .String(puid)
                .Build()
            .Build();

        return session.ExecuteDataQuery(
            query,
            NYdb::NTable::TTxControl::BeginTx(NYdb::NTable::TTxSettings::OnlineRO()).CommitTx(),
            std::move(params),
            ExecDataQuerySettings_
        ).Apply([resultSet](const NYdb::NTable::TAsyncDataQueryResult& fut) mutable -> NYdb::TStatus {
            const auto& res = fut.GetValueSync();
            if (res.IsSuccess()) {
                *resultSet = res.GetResultSet(0);
            }

            return res;
        });
    }, RetryOperationSettings_).Apply([resultSet, operationContext = std::move(operationContext)](const NYdb::TAsyncStatus& fut) mutable -> TExpected<TVector<TString>, TString> {
        const auto& status = fut.GetValueSync();
        if (const auto opRes = operationContext.ReportResult(status); !opRes) {
            return opRes.Error();
        }

        NYdb::TResultSetParser parser(resultSet->GetRef());
        TVector<TString> result;
        result.reserve(parser.RowsCount());
        while (parser.TryNextRow()) {
            result.emplace_back(*parser.ColumnParser("device_id").GetOptionalString());
        }

        return result;
    });

I have many chains like this in my code, so something like the following is quite possible:

    [...](...) {
        // ...
        return AnotherYdbRequest().Apply(
            // ...
        );
    }
)

In other words, I build long chains of Apply calls, trusting that YDB will always set the promise.
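
The reliance on the promise always being set can be made explicit with a bounded wait. The sketch below uses `std::future` from the standard library rather than the SDK's `NThreading::TFuture`, and the helper name `GetWithDeadline` is hypothetical. It only illustrates how a caller could notice a future whose promise was never fulfilled instead of waiting indefinitely.

```cpp
#include <chrono>
#include <future>
#include <optional>

// Hypothetical helper (not part of the YDB SDK): waits for `fut` for at most
// `timeout`. Returns std::nullopt if the promise was not fulfilled in time,
// otherwise the stored value.
template <class T>
std::optional<T> GetWithDeadline(std::future<T>& fut, std::chrono::milliseconds timeout) {
    if (fut.wait_for(timeout) != std::future_status::ready) {
        return std::nullopt;  // the promise has not been set (yet)
    }
    return fut.get();
}
```

With a `std::promise<int>` that nobody fulfils, `GetWithDeadline` returns an empty optional once the timeout expires; after `set_value` has been called, it returns the stored value.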

In the driver settings I set these options:

    auto driverConfig = NYdb::TDriverConfig()
        .SetEndpoint(config.GetAddress())
        .SetDatabase(config.GetDBName())
        .SetAuthToken(GetEnv("YDB_TOKEN"))
        .SetBalancingPolicy(NYdb::EBalancingPolicy::UsePreferableLocation)
        .SetNetworkThreadsNum(config.GetNetworkThreads())
        .SetDiscoveryMode(NYdb::EDiscoveryMode::Async);

That is, I do not set the SetClientThreadsNum and SetMaxClientQueueSize options, which can lead to a deadlock (I use the adaptive pool).
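
To illustrate why a fixed thread count combined with a bounded queue can deadlock, here is a minimal sketch. `TBoundedPool`, `AddFor`, and `ContinuationCanEnqueue` are hypothetical names, not the SDK's or util's thread pool, and whether the SDK fails in exactly this way is an assumption. The timed `AddFor` stands in for a blocking `Add` so that the stuck state can be observed rather than hanging: a task running on the only worker tries to enqueue a follow-up while the queue is full.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical pool for illustration only: fixed worker count, bounded queue.
class TBoundedPool {
public:
    TBoundedPool(std::size_t threads, std::size_t maxQueue) : MaxQueue_(maxQueue) {
        for (std::size_t i = 0; i < threads; ++i) {
            Workers_.emplace_back([this] { Run(); });
        }
    }

    ~TBoundedPool() {
        {
            std::lock_guard<std::mutex> guard(Mutex_);
            Stop_ = true;
        }
        NotEmpty_.notify_all();
        for (auto& worker : Workers_) {
            worker.join();
        }
    }

    // Timed stand-in for a blocking Add: returns false if the queue stayed
    // full for the whole timeout (a blocking Add would keep waiting).
    bool AddFor(std::function<void()> task, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(Mutex_);
        if (!NotFull_.wait_for(lock, timeout, [this] { return Queue_.size() < MaxQueue_; })) {
            return false;
        }
        Queue_.push_back(std::move(task));
        NotEmpty_.notify_one();
        return true;
    }

private:
    void Run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(Mutex_);
                NotEmpty_.wait(lock, [this] { return Stop_ || !Queue_.empty(); });
                if (Queue_.empty()) {
                    return;  // stopping and nothing left to run
                }
                task = std::move(Queue_.front());
                Queue_.pop_front();
            }
            NotFull_.notify_one();
            task();
        }
    }

    const std::size_t MaxQueue_;
    std::mutex Mutex_;
    std::condition_variable NotEmpty_;
    std::condition_variable NotFull_;
    std::deque<std::function<void()>> Queue_;
    std::vector<std::thread> Workers_;
    bool Stop_ = false;
};

// Runs a task on a single-worker pool that tries to enqueue a follow-up while
// one other task is already queued. Returns whether the inner add succeeded.
// maxQueue must be at least 1.
bool ContinuationCanEnqueue(std::size_t maxQueue) {
    using namespace std::chrono_literals;
    std::promise<void> started;
    std::promise<void> queueFilled;
    std::promise<bool> innerAdd;
    std::future<void> startedFut = started.get_future();
    std::future<void> filledFut = queueFilled.get_future();
    std::future<bool> innerAddFut = innerAdd.get_future();
    TBoundedPool pool(1, maxQueue);

    pool.AddFor([&] {
        started.set_value();   // the only worker is now busy with this task
        filledFut.wait();      // wait until the main thread has queued a task
        innerAdd.set_value(pool.AddFor([] {}, 200ms));
    }, 1s);
    startedFut.wait();
    pool.AddFor([] {}, 1s);    // occupies one queue slot
    queueFilled.set_value();
    return innerAddFut.get();
}
```

In this sketch `ContinuationCanEnqueue(1)` returns `false`: the queue is full and the only thread that could drain it is the one waiting to add, so a blocking `Add` would never return. `ContinuationCanEnqueue(2)` returns `true` because a free slot remains.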

In the table client settings I set these options:

    , Client_(
        driver,
        NYdb::NTable::TClientSettings()
            .UseQueryCache(false)
            .SessionPoolSettings(
                NYdb::NTable::TSessionPoolSettings()
                    .MaxActiveSessions(config.GetMaxActiveSessions())
            )
    )
    , RetryOperationSettings_(NYdb::NTable::TRetryOperationSettings().MaxRetries(config.GetMaxRetries()))
    , ExecDataQuerySettings_(
        NYdb::NTable::TExecDataQuerySettings()
            .OperationTimeout(FromString<TDuration>(config.GetOperationTimeout()))
            .ClientTimeout(FromString<TDuration>(config.GetClientTimeout()))
            .CancelAfter(FromString<TDuration>(config.GetCancelAfter()))
            .KeepInQueryCache(true)
    )

(It is precisely because of the CancelAfter option that I trust the value of the request's future will be set.)

If it matters: I may have two or more table clients on top of a single driver.

With roughly this setup I got a deadlock.

All of my threads are blocked waiting in the adaptive pool's Add:

#0  0x00007fec79b27ad3 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00000000025e3e34 in TCondVar::TImpl::WaitD (this=0x474c3fc9673c, lock=..., deadLine=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/system/condvar.cpp:99
#2  TCondVar::WaitD (this=0x474c3fb000f0, mutex=..., deadLine=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/system/condvar.cpp:144
#3  TCondVar::WaitI (this=0x474c3fb000f0, m=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/system/condvar.h:50
#4  TCondVar::Wait (this=0x474c3fb000f0, m=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/system/condvar.h:60
#5  TAdaptiveThreadPool::TImpl::Add (this=0x474c3fb000b0, obj=0x474c3c54d200) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/thread/pool.cpp:448
#6  0x00000000025e3b5a in TAdaptiveThreadPool::Add (this=<optimized out>, obj=0x80) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/thread/pool.cpp:570
#7  0x0000000003ae46e9 in NYdb::TGRpcConnectionsImpl::EnqueueResponse (this=0x474c3f80c180, action=0x474c3c54d200) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/impl/ydb_internal/grpc_connections/grpc_connections.cpp:421
#8  NYdb::TGRpcConnectionsImpl::ScheduleOneTimeTask(std::__y1::function<void ()>&&, TDuration)::$_1::operator()(NYql::TIssues&&, NYdb::EStatus) (this=0x7fec6dd3fcb0, status=<optimized out>) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/impl/ydb_internal/grpc_connections/grpc_connections.cpp:229
#9  0x0000000003ae4445 in NYdb::TGRpcConnectionsImpl::ScheduleOneTimeTask(std::__y1::function<void ()>&&, TDuration) (this=0x474c3f80c180, fn=..., timeout=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/impl/ydb_internal/grpc_connections/grpc_connections.cpp:238
#10 0x0000000003bb6480 in NYdb::NTable::TSessionPoolImpl::CreateFakeSession (promise=..., client=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/ydb_table/table.cpp:2869
#11 0x0000000003bb6cc3 in NYdb::NTable::TSessionPoolImpl::GetSession (this=0x474c3fbc34c0, client=warning: RTTI symbol not found for class 'std::__y1::__shared_ptr_pointer<NYdb::NTable::TTableClient::TImpl*, std::__y1::shared_ptr<NYdb::NTable::TTableClient::TImpl>::__shared_ptr_default_delete<NYdb::NTable::TTableClient::TImpl, NYdb::NTable::TTableClient::TImpl>, std::__y1::allocator<NYdb::NTable::TTableClient::TImpl> >'
warning: RTTI symbol not found for class 'std::__y1::__shared_ptr_pointer<NYdb::NTable::TTableClient::TImpl*, std::__y1::shared_ptr<NYdb::NTable::TTableClient::TImpl>::__shared_ptr_default_delete<NYdb::NTable::TTableClient::TImpl, NYdb::NTable::TTableClient::TImpl>, std::__y1::allocator<NYdb::NTable::TTableClient::TImpl> >'
std::__y1::shared_ptr (count 3717, weak 5) = 0x474c3fbc33c0, settings=..., sessionProvider=0x3bb5140 <NYdb::NTable::TTableClient::TImpl::SettlerAwareSessonProvider(std::__y1::shared_ptr<NYdb::NTable::TTableClient::TImpl>, NYdb::NTable::TCreateSessionSettings const&)>) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/ydb_table/table.cpp:2905
#12 0x0000000003bb95df in NYdb::NTable::TTableClient::TImpl::GetSession (this=<optimized out>, settings=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/ydb_table/table.cpp:1854
#13 0x0000000003be4158 in NYdb::NTable::TTableClient::GetSession (this=0x474c0acf6748, settings=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/ydb_table/table.cpp:3132
#14 NYdb::NTable::TRetryOperationWithSession::Execute (this=0x474c0acf66f0) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/ydb_table/table.cpp:3290
#15 0x0000000003bb98a6 in NYdb::NTable::TTableClient::RetryOperation(std::__y1::function<NThreading::TFuture<NYdb::TStatus> (NYdb::NTable::TSession)>&&, NYdb::NTable::TRetryOperationSettings const&) (this=<optimized out>, operation=..., settings=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/ydb_table/table.cpp:3329

// table client call from here
#16 0x0000000003b89a50 in NMatrix::NNotificator::TConnectionsStorage::UpdateConnectionsWithFullStateRemoveAll (this=<optimized out>, endpoint=..., logContext=..., metrics=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/alice/matrix/notificator/library/storages/connections/storage.cpp:320
// my code stack

The YDB threads also all seem to be blocked on Add into this thread pool:

Thread 14 (Thread 0x7fec72d4b700 (LWP 158)):
#0  0x00007fec79b27ad3 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00000000025e3e34 in TCondVar::TImpl::WaitD (this=0x474c3fc9673c, lock=..., deadLine=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/system/condvar.cpp:99
#2  TCondVar::WaitD (this=0x474c3fb000f0, mutex=..., deadLine=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/system/condvar.cpp:144
#3  TCondVar::WaitI (this=0x474c3fb000f0, m=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/system/condvar.h:50
#4  TCondVar::Wait (this=0x474c3fb000f0, m=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/system/condvar.h:60
#5  TAdaptiveThreadPool::TImpl::Add (this=0x474c3fb000b0, obj=0x474c37206040) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/thread/pool.cpp:448
#6  0x00000000025e3b5a in TAdaptiveThreadPool::Add (this=<optimized out>, obj=0x80) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/thread/pool.cpp:570
#7  0x0000000003cb6ca7 in NYdb::TGRpcConnectionsImpl::EnqueueResponse (this=0x474c3f80c180, action=0x474c37206040) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/impl/ydb_internal/grpc_connections/grpc_connections.cpp:421
#8  NYdb::TGRpcConnectionsImpl::Run<Ydb::Table::V1::TableService, Ydb::Table::ExecuteDataQueryRequest, Ydb::Table::ExecuteDataQueryResponse>(Ydb::Table::ExecuteDataQueryRequest&&, std::__y1::function<void (Ydb::Table::ExecuteDataQueryResponse*, NYdb::TPlainStatus)>&&, NGrpc::TSimpleRequestProcessor<Ydb::Table::V1::TableService::Stub, Ydb::Table::ExecuteDataQueryRequest, Ydb::Table::ExecuteDataQueryResponse>::TAsyncRequest, std::__y1::shared_ptr<NYdb::TDbDriverState>, NYdb::TRpcRequestSettings const&, TDuration, NYdb::TEndpointKey const&, std::__y1::shared_ptr<NGrpc::IQueueClientContext>)::{lambda(NYdb::TPlainStatus, std::__y1::unique_ptr<NGrpc::TServiceConnection<Ydb::Table::V1::TableService>, std::__y1::default_delete<NGrpc::TServiceConnection<Ydb::Table::V1::TableService> > >, NYdb::TEndpointKey)#1}::operator()(NYdb::TPlainStatus, std::__y1::unique_ptr<NGrpc::TServiceConnection<Ydb::Table::V1::TableService>, std::__y1::default_delete<NGrpc::TServiceConnection<Ydb::Table::V1::TableService> > >, NYdb::TEndpointKey)::{lambda(grpc::ClientContext const&, NGrpc::TGrpcStatus&&, Ydb::Table::ExecuteDataQueryResponse&&)#1}::operator()(grpc::ClientContext const, NGrpc::TGrpcStatus, NGrpc::TGrpcStatus&&) (this=<optimized out>, ctx=..., grpcStatus=..., response=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/ydb/public/sdk/cpp/client/impl/ydb_internal/grpc_connections/grpc_connections.h:250
#9  0x0000000003cb876c in std::__y1::__function::__value_func<void (grpc::ClientContext const&, NGrpc::TGrpcStatus&&, Ydb::Table::ExecuteDataQueryResponse&&)>::operator()(grpc::ClientContext const&, NGrpc::TGrpcStatus&&, Ydb::Table::ExecuteDataQueryResponse&&) const (this=0x474c3fbf5d30, __args=..., __args=..., __args=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:508
#10 std::__y1::function<void (grpc::ClientContext const&, NGrpc::TGrpcStatus&&, Ydb::Table::ExecuteDataQueryResponse&&)>::operator()(grpc::ClientContext const&, NGrpc::TGrpcStatus&&, Ydb::Table::ExecuteDataQueryResponse&&) const (this=0x474c3fbf5d30, __arg=..., __arg=..., __arg=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:1192
#11 NGrpc::TAdvancedRequestProcessor<Ydb::Table::V1::TableService::Stub, Ydb::Table::ExecuteDataQueryRequest, Ydb::Table::ExecuteDataQueryResponse>::Execute (this=0x474c3fbf5b80, ok=true) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/library/cpp/grpc/client/grpc_client_low.h:344
#12 0x0000000003b36a18 in NGrpc::PullEvents (cq=0x474c3fc07a80) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/library/cpp/grpc/client/grpc_client_low.cpp:194
#13 NGrpc::TGRpcClientLow::Init(unsigned long)::$_2::operator()() const (this=<optimized out>) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/library/cpp/grpc/client/grpc_client_low.cpp:430
#14 std::__y1::__invoke<NGrpc::TGRpcClientLow::Init(unsigned long)::$_2&>(NGrpc::TGRpcClientLow::Init(unsigned long)::$_2&) (__f=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/contrib/libs/cxxsupp/libcxx/include/type_traits:3671
#15 std::__y1::__invoke_void_return_wrapper<void, true>::__call<NGrpc::TGRpcClientLow::Init(unsigned long)::$_2&>(NGrpc::TGRpcClientLow::Init(unsigned long)::$_2&) (__args=...) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/contrib/libs/cxxsupp/libcxx/include/__functional/invoke.h:61
#16 std::__y1::__function::__alloc_func<NGrpc::TGRpcClientLow::Init(unsigned long)::$_2, std::__y1::allocator<NGrpc::TGRpcClientLow::Init(unsigned long)::$_2>, void ()>::operator()() (this=<optimized out>) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:181
#17 std::__y1::__function::__func<NGrpc::TGRpcClientLow::Init(unsigned long)::$_2, std::__y1::allocator<NGrpc::TGRpcClientLow::Init(unsigned long)::$_2>, void ()>::operator()() (this=<optimized out>) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:355
#18 0x00000000025e5546 in std::__y1::__function::__value_func<void ()>::operator()() const (this=0x474c3fc0c4d0) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:508
#19 std::__y1::function<void ()>::operator()() const (this=0x474c3fc0c4d0) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:1192
#20 (anonymous namespace)::TThreadFactoryFuncObj::DoExecute (this=0x474c3fc0c4c0) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/thread/factory.cpp:61
#21 0x00000000025e5967 in IThreadFactory::IThreadAble::Execute (this=0x474c3fc9673c) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/thread/factory.h:15
#22 (anonymous namespace)::TSystemThreadFactory::TPoolThread::ThreadProc (func=0x474c3fc9673c) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/thread/factory.cpp:36
#23 0x00000000022149a6 in (anonymous namespace)::TPosixThread::ThreadProxy (arg=0x474c3fc950c0) at /place/sandbox-data/tasks/7/0/1382850307/__FUSE/mount_path_ccf18fa7-3c1f-4017-a724-9e910c2ec9b1/util/system/thread.cpp:229
#24 0x00007fec79b216db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#25 0x00007fec7984a71f in clone () from /lib/x86_64-linux-gnu/libc.so.6

(To be honest, I only skimmed these, so I may have missed something.)

Unfortunately, I made a mistake and did not keep the core file (I only have a backtrace dump of all threads).

Can you tell me what I am doing wrong and how to avoid this deadlock?

kardymonds commented 7 months ago

[Egor Chunaev] [22 Jul 2022, 00:35] With this setup, this is the first deadlock in about a year.

[Oleg Bondar] mentioned Daniil Cherednik [22 Jul 2022, 06:43] Hi, can you help?

[Denis Vorkozhokov] [22 Jul 2022, 10:30] We caught something similar too, by the way, in case it is useful: {paste links removed}

At first glance it looked like the adding threads went to sleep on this condvar: arcadia/util/thread/pool.cpp?rev=r9383818#L449, and the workers on this one: .../trunk/arcadia/util/thread/pool.cpp?rev=r9383818#L519

[Egor Chunaev] mentioned Daniil Cherednik [11 Aug 2022, 03:47] Hi

I see you are back from vacation; will you keep track of this ticket?

Edited by [Egor Chunaev] 11 Aug 2022, 03:51

[Daniil Cherednik] [11 Aug 2022, 11:34] Hi

I see you are back from vacation; will you keep track of this ticket?

Yes, I saw it. But so far I do not know how to approach it; it looks like something in TAdaptiveThreadPool, but it is not obvious what... Did this happen during program shutdown? Are there any problems with the system (file descriptor limits, etc.)...

[Egor Chunaev] mentioned Denis Vorkozhokov [11 Aug 2022, 11:50] Yes, I saw it. But so far I do not know how to approach it; it looks like something in TAdaptiveThreadPool, but it is not obvious what...

Ugh, I should not have rushed; I did not dump the core to disk :(

Did this happen during program shutdown? Are there any problems with the system (file descriptor limits, etc.)...

It seems not. The pod looked healthy in Nanny (CPU and unevict mem usage were below 30%, and the total thread count in the pod was well below 10k, the hard limit on threads in RTC), and the other sidecars (push_client, tvmtool) in it kept working. I admit I did not check the descriptors, but my application should not consume many of them.

[Denis Vorkozhokov] hi, do you also have only backtraces, or is there a core dump?

[Denis Vorkozhokov] [11 Aug 2022, 11:58] Denis Vorkozhokov hi, do you also have only backtraces, or is there a core dump?

Only traces :(

Yes, I saw it. But so far I do not know how to approach it; it looks like something in TAdaptiveThreadPool, but it is not obvious what...

Yeah, a colleague and I stared hard at the code and still could not figure out what is wrong there :(

Edited by [Denis Vorkozhokov] 11 Aug 2022, 11:58

[Renat Nyazhemetdinov] mentioned Egor Chunaev [24 Oct 2022, 11:59] Egor, hi!

Any updates on this ticket? Have you hit this problem again? Any new details?

[Egor Chunaev] mentioned Denis Vorkozhokov [24 Oct 2022, 14:49] Hi

Fortunately, I have not hit it again. As I wrote in a comment above, this was the first deadlock in about a year (and I really messed up by not dumping the core :( ).

Maybe vorkdenis@ has come across something in the meantime.

[Denis Vorkozhokov] [24 Oct 2022, 15:42] Nope, it has not fired for us either.

[Renat Nyazhemetdinov] mentioned Denis Vorkozhokov, Egor Chunaev [24 Oct 2022, 15:54] Guys, I am closing this ticket then. If it fires again, you are welcome to come back with all the details and we will dig into it.

[Egor Chunaev] [24 Oct 2022, 16:02] OK, agreed.

kardymonds commented 7 months ago

[Dmitry Kardymon] [15 Mar, 11:30] It reproduces for us (Yandex Query):

Approximate log of one occurrence:

-- pod restart
2024-03-14T02:46:18.833816Z Bootstrap, InstanceId: fe8599c0-403d8f98-b5fc9ab8-237970e8
...
everything works
...
2024-03-14T06:14:57.342358Z Last successful operation with TTableClient
...
-- errors start appearing for pqv1 / ReadSession and keep recurring from then on
2024-03-14T06:26:51.893912Z 2024-03-14T06:26:51.893909Z :ERROR: [/ru-central1/yc.yaem.service-cloud/etnkivt59agm6jmkqnh6] [/ru-central1/yc.yaem.service-cloud/etnkivt59agm6jmkqnh6] [2e7e280f-7084f2f7-f310a060-988b6f95] [null] Got error. Status: CLIENT_LIMITS_REACHED. Description: <main>: Error: Requests queue limit reached

-- pod restart, after which everything works
2024-03-14T09:16:25.393697Z Bootstrap, InstanceId: 46fad4c8-4660722b-45349ea4-7930f714

I have attached the YDB_SDK log: csqufdni8td23ekvkd51 (26).csv

kardymonds commented 7 months ago

We captured stacks for two occurrences, and it turned out that our deadlock may be a different problem. Stack of the first occurrence (only the 2 interesting threads are shown):

// PushEvent
Thread 21 (Thread 0x7f2d9a970700 (LWP 38)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f2da5271e42 in __GI___pthread_mutex_lock (mutex=0x44b5d02786a0) at ../nptl/pthread_mutex_lock.c:115
#2  0x000056391dbe212d in TMutex::TImpl::Acquire (this=0x44b5d02786a0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/mutex.cpp:62
#3  TMutex::Acquire (this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/mutex.cpp:133
#4  0x0000563921258987 in TCommonLockOps<TMutex>::Acquire (t=0x44b5f117c9e0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:9
#5  TGuard<TMutex, TCommonLockOps<TMutex> >::Init (t=0x44b5f117c9e0, this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:81
#6  TGuard<TMutex, TCommonLockOps<TMutex> >::TGuard (t=0x44b5f117c9e0, this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:46
#7  Guard<TMutex> (t=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:96
#8  NYdb::NPersQueue::TReadSessionEventsQueue<false>::PushEvent (this=this@entry=0x44b5f117c978, stream=<error reading variable: Cannot access memory at address 0x80>, event=<error reading variable: Cannot access memory at address 0x0>, deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:1856
#9  0x0000563921269856 in NYdb::NPersQueue::TSingleClusterReadSessionImpl<false>::OnReadDoneImpl<Ydb::Topic::StreamReadMessage_StartPartitionSessionRequest>(Ydb::Topic::StreamReadMessage_StartPartitionSessionRequest&&, NYdb::NPersQueue::TDeferredActions<false>&) (this=this@entry=0x44b5dcfb9120, msg=<optimized out>, deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:1322
#10 0x000056392126843a in NYdb::NPersQueue::TSingleClusterReadSessionImpl<false>::OnReadDone(NYdbGrpc::TGrpcStatus&&, unsigned long) (this=0x44b5dcfb9120, grpcStatus=grpcStatus@entry=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4b, DIE 0x1b804249>, connectionGeneration=2) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:899
#11 0x0000563921267ede in NYdb::NPersQueue::TSingleClusterReadSessionImpl<false>::ReadFromProcessorImpl(NYdb::NPersQueue::TDeferredActions<false>&)::{lambda(NYdbGrpc::TGrpcStatus&&)#1}::operator()(NYdbGrpc::TGrpcStatus&&) const (this=0x44b63fe62388, grpcStatus=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4b, DIE 0x1b803dc6>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:837
#12 0x00005639212278e3 in std::__y1::__function::__value_func<void (NYdbGrpc::TGrpcStatus&&)>::operator()[abi:ue170006](NYdbGrpc::TGrpcStatus&&) const (this=0x7f2d9a939e10, __args=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b4e86b9, DIE 0x1b5b7b18>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:517
#13 std::__y1::function<void (NYdbGrpc::TGrpcStatus&&)>::operator()(NYdbGrpc::TGrpcStatus&&) const (this=0x7f2d9a939e10, __arg=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b4e86b9, DIE 0x1b5b7af6>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:1168
#14 NYdbGrpc::TStreamRequestReadWriteProcessor<Ydb::Topic::V1::TopicService::Stub, Ydb::Topic::StreamReadMessage_FromClient, Ydb::Topic::StreamReadMessage_FromServer>::OnReadDone (this=<optimized out>, ok=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/grpc/client/grpc_client_low.h:1104
#15 0x000056392122ba26 in NYdbGrpc::TQueueClientFixedEvent<NYdbGrpc::TStreamRequestReadWriteProcessor<Ydb::Topic::V1::TopicService::Stub, Ydb::Topic::StreamReadMessage_FromClient, Ydb::Topic::StreamReadMessage_FromServer> >::Execute (this=<optimized out>, ok=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/grpc/client/grpc_client_low.h:76
#16 0x000056391f9a3c1a in NYdbGrpc::PullEvents (cq=0x44b5ffcf7800) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/grpc/client/grpc_client_low.cpp:194
#17 0x000056391dbf32ba in std::__y1::__function::__value_func<void ()>::operator()[abi:ue170006]() const (this=0x44b5ffcf7890) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:517
#18 std::__y1::function<void ()>::operator()() const (this=0x44b5ffcf7890) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__functional/function.h:1168
#19 (anonymous namespace)::TThreadFactoryFuncObj::DoExecute (this=0x44b5ffcf7880) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/thread/factory.cpp:61
#20 0x000056391dbf377d in IThreadFactory::IThreadAble::Execute (this=0x44b5d02786a0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/thread/factory.h:15
#21 (anonymous namespace)::TSystemThreadFactory::TPoolThread::ThreadProc (func=0x44b5d02786a0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/thread/factory.cpp:36
#22 0x000056391dbeebea in (anonymous namespace)::TPosixThread::ThreadProxy (arg=0x44b5ff2acdc0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/thread.cpp:244
#23 0x00007f2da526f6ba in start_thread (arg=0x7f2d9a970700) at pthread_create.c:333
#24 0x00007f2da4da251d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

// OnDecompressionInfoDestroy
Thread 41 (Thread 0x7f2d9053a700 (LWP 58)):
#0  0x00007f2da4d6738d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f2da4d98e94 in usleep (useconds=<optimized out>) at ../sysdeps/posix/usleep.c:32
#2  0x000056391dbea9b4 in TSpinWait::Sleep (this=0x7f2d90503058) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/spin_wait.cpp:36
#3  0x000056392126f4d8 in TAdaptiveLock::Acquire (this=0x44b5dcfb94c8) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/spinlock.h:116
#4  TCommonLockOps<TAdaptiveLock>::Acquire (t=0x44b5dcfb94c8) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:9
#5  TGuard<TAdaptiveLock, TCommonLockOps<TAdaptiveLock> >::Init (t=0x44b5dcfb94c8, this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:81
#6  TGuard<TAdaptiveLock, TCommonLockOps<TAdaptiveLock> >::TGuard (t=0x44b5dcfb94c8, this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:46
#7  Guard<TAdaptiveLock> (t=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:96
#8  NYdb::NPersQueue::TSingleClusterReadSessionImpl<false>::OnDecompressionInfoDestroy (this=0x44b5dcfb9120, compressedSize=0, decompressedSize=1674, messagesCount=<optimized out>, serverBytesSize=0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:1507
#9  0x000056392126f138 in NYdb::NPersQueue::TDataDecompressionInfo<false>::~TDataDecompressionInfo (this=0x44b6196b7418) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:2247
#10 0x0000563921259dcd in std::__y1::__shared_count::__release_shared[abi:ue170006]() (this=0x44b6196b7400) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/shared_ptr.h:185
#11 std::__y1::__shared_weak_count::__release_shared[abi:ue170006]() (this=0x44b6196b7400) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/shared_ptr.h:239
#12 std::__y1::shared_ptr<NYdb::NPersQueue::TDataDecompressionInfo<false> >::~shared_ptr[abi:ue170006]() (this=0x10) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/shared_ptr.h:798
#13 NYdb::NPersQueue::TDataDecompressionEvent<false>::~TDataDecompressionEvent (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:357
#14 std::__y1::__variant_detail::__alt<0ul, NYdb::NPersQueue::TDataDecompressionEvent<false> >::~__alt (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:739
#15 _ZZNSt4__y116__variant_detail6__dtorINS0_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS3_6NTopic17TReadSessionEvent18TDataReceivedEventENS9_33TCommitOffsetAcknowledgementEventENS9_27TStartPartitionSessionEventENS9_26TStopPartitionSessionEventENS9_28TPartitionSessionStatusEventENS9_28TPartitionSessionClosedEventENS8_19TSessionClosedEventEEEEEEELNS0_6_TraitE1EE9__destroyB8ue170006EvENKUlRT_E_clINS0_5__altILm0ES6_EEEEDaSM_ (__alt=..., this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:878
#16 _ZNSt4__y18__invokeB8ue170006IZNS_16__variant_detail6__dtorINS1_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS4_6NTopic17TReadSessionEvent18TDataReceivedEventENSA_33TCommitOffsetAcknowledgementEventENSA_27TStartPartitionSessionEventENSA_26TStopPartitionSessionEventENSA_28TPartitionSessionStatusEventENSA_28TPartitionSessionClosedEventENS9_19TSessionClosedEventEEEEEEELNS1_6_TraitE1EE9__destroyB8ue170006EvEUlRT_E_JRNS1_5__altILm0ES7_EEEEEDTclclsr3stdE7declvalISM_EEspclsr3stdE7declvalIT0_EEEEOSM_DpOSS_ (__args=..., __f=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__type_traits/invoke.h:340
#17 _ZNSt4__y116__variant_detail12__visitation6__base12__dispatcherIJLm0EEE10__dispatchB8ue170006IOZNS0_6__dtorINS0_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS8_6NTopic17TReadSessionEvent18TDataReceivedEventENSE_33TCommitOffsetAcknowledgementEventENSE_27TStartPartitionSessionEventENSE_26TStopPartitionSessionEventENSE_28TPartitionSessionStatusEventENSE_28TPartitionSessionClosedEventENSD_19TSessionClosedEventEEEEEEELNS0_6_TraitE1EE9__destroyB8ue170006EvEUlRT_E_JRNS0_6__baseILSO_1EJSB_SM_EEEEEEDcSQ_DpT0_ (__f=<optimized out>, __vs=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:572
#18 0x00005639212785f2 in _ZNSt4__y116__variant_detail12__visitation6__base11__visit_altB8ue170006IZNS0_6__dtorINS0_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS6_6NTopic17TReadSessionEvent18TDataReceivedEventENSC_33TCommitOffsetAcknowledgementEventENSC_27TStartPartitionSessionEventENSC_26TStopPartitionSessionEventENSC_28TPartitionSessionStatusEventENSC_28TPartitionSessionClosedEventENSB_19TSessionClosedEventEEEEEEELNS0_6_TraitE1EE9__destroyB8ue170006EvEUlRT_E_JRSN_EEEDcOSO_DpOT0_ (__visitor=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4b, DIE 0x1b77c841>, __vs=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:535
#19 std::__y1::__variant_detail::__dtor<std::__y1::__variant_detail::__traits<NYdb::NPersQueue::TDataDecompressionEvent<false>, std::__y1::variant<NYdb::NTopic::TReadSessionEvent::TDataReceivedEvent, NYdb::NTopic::TReadSessionEvent::TCommitOffsetAcknowledgementEvent, NYdb::NTopic::TReadSessionEvent::TStartPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TStopPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionStatusEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionClosedEvent, NYdb::NTopic::TSessionClosedEvent> >, (std::__y1::__variant_detail::_Trait)1>::__destroy[abi:ue170006]() (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:878
#20 std::__y1::__variant_detail::__dtor<std::__y1::__variant_detail::__traits<NYdb::NPersQueue::TDataDecompressionEvent<false>, std::__y1::variant<NYdb::NTopic::TReadSessionEvent::TDataReceivedEvent, NYdb::NTopic::TReadSessionEvent::TCommitOffsetAcknowledgementEvent, NYdb::NTopic::TReadSessionEvent::TStartPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TStopPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionStatusEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionClosedEvent, NYdb::NTopic::TSessionClosedEvent> >, (std::__y1::__variant_detail::_Trait)1>::~__dtor (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:878
#21 std::__y1::variant<NYdb::NPersQueue::TDataDecompressionEvent<false>, std::__y1::variant<NYdb::NTopic::TReadSessionEvent::TDataReceivedEvent, NYdb::NTopic::TReadSessionEvent::TCommitOffsetAcknowledgementEvent, NYdb::NTopic::TReadSessionEvent::TStartPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TStopPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionStatusEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionClosedEvent, NYdb::NTopic::TSessionClosedEvent> >::~variant[abi:ue170006]() (this=0x44b5de612610) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:1399
#22 NYdb::NPersQueue::TRawPartitionStreamEvent<false>::~TRawPartitionStreamEvent (this=0x44b5de612610) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:440
#23 std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> >::destroy[abi:ue170006](NYdb::NPersQueue::TRawPartitionStreamEvent<false>*) (this=0x44b5cca38cf8, __p=0x44b5de612610) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/allocator.h:172
#24 std::__y1::allocator_traits<std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> > >::destroy[abi:ue170006]<NYdb::NPersQueue::TRawPartitionStreamEvent<false>, void>(std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> >&, NYdb::NPersQueue::TRawPartitionStreamEvent<false>*) (__a=..., __p=0x44b5de612610) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/allocator_traits.h:315
#25 std::__y1::deque<NYdb::NPersQueue::TRawPartitionStreamEvent<false>, std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> > >::clear (this=0x44b5cca38cd0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/deque:2868
#26 NYdb::NPersQueue::TRawPartitionStreamEventQueue<false>::clear (this=0x44b5cca38cd0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:533
#27 0x0000563921247485 in NYdb::NPersQueue::TPartitionStreamImpl<false>::ClearQueue (this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:739
#28 NYdb::NPersQueue::TReadSessionEventsQueue<false>::Close (this=0x44b5f117c978, event=..., deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:806
#29 0x0000563921248b1b in NYdb::NTopic::TReadSession::AbortImpl(NYdb::NTopic::TSessionClosedEvent&&, NYdb::NPersQueue::TDeferredActions<false>&) (this=this@entry=0x44b6324ea420, closeEvent=closeEvent@entry=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4b, DIE 0x1b752f01>, deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:367
#30 0x0000563921248cd6 in NYdb::NTopic::TReadSession::AbortImpl(NYdb::EStatus, NYql::TIssues&&, NYdb::NPersQueue::TDeferredActions<false>&) (this=this@entry=0x44b6324ea420, statusCode=statusCode@entry=NYdb::EStatus::ABORTED, issues=issues@entry=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4b, DIE 0x1b75126e>, deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:374
#31 0x00005639212427d0 in NYdb::NTopic::TReadSession::AbortImpl (this=this@entry=0x44b6324ea420, statusCode=statusCode@entry=NYdb::EStatus::ABORTED, message=..., deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:382
#32 0x00005639212460ef in NYdb::NTopic::TReadSession::Close (this=0x44b6324ea420, timeout=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:274
#33 0x00005639300e70fa in NYql::NDq::TDqPqReadActor::PassAway (this=0x44b5df4b3300) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/providers/pq/async_io/dq_pq_read_actor.cpp:263
#34 0x00005639301922bc in NYql::NDq::TDqComputeActorBase<NYql::NDq::TDqAsyncComputeActor, NYql::NDq::TComputeActorAsyncInputHelperForTaskRunnerActor>::Terminate (this=this@entry=0x44b63d02a800, success=false, issues=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h:458
#35 0x000056393018e0b4 in NYql::NDq::TDqComputeActorBase<NYql::NDq::TDqAsyncComputeActor, NYql::NDq::TComputeActorAsyncInputHelperForTaskRunnerActor>::HandleExecuteBase (this=0x44b63d02a800, ev=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h:1127
#36 0x0000563930193fd0 in NYql::NDq::TDqComputeActorBase<NYql::NDq::TDqAsyncComputeActor, NYql::NDq::TComputeActorAsyncInputHelperForTaskRunnerActor>::StateFuncWrapper<&NYql::NDq::TDqAsyncComputeActor::StateFuncBody> (this=0x7f2d90502ff0, ev=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h:225
#37 0x000056391e1175ad in NActors::TGenericExecutorThread::Execute<NActors::TMailboxTable::THTSwapMailbox> (this=this@entry=0x44b5fd610800, mailbox=0x44b5fc477cc0, hint=hint@entry=3571, isTailExecution=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:244
#38 0x000056391e10f62b in NActors::TGenericExecutorThread::ProcessExecutorPool(NActors::IExecutorPool*)::$_0::operator()(unsigned int, bool) const (this=this@entry=0x7f2d90503e00, activation=activation@entry=3571, isTailExecution=false) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:425
#39 0x000056391e10efce in NActors::TGenericExecutorThread::ProcessExecutorPool (this=this@entry=0x44b5fd610800, pool=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:478
#40 0x000056391e10fe78 in NActors::TExecutorThread::ThreadProc (this=0x44b5fd610800) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:504
#41 0x000056391dbeebea in (anonymous namespace)::TPosixThread::ThreadProxy (arg=0x44b5fb373810) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/thread.cpp:244
#42 0x00007f2da526f6ba in start_thread (arg=0x7f2d9053a700) at pthread_create.c:333
#43 0x00007f2da4da251d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Stack trace of the second case:

#0  0x00007f5836e1838d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f5836e49e94 in usleep (useconds=<optimized out>) at ../sysdeps/posix/usleep.c:32
#2  0x000055c885f6e9b4 in TSpinWait::Sleep (this=0x7f58225b4058) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/spin_wait.cpp:36
#3  0x000055c8895f34d8 in TAdaptiveLock::Acquire (this=0x41ee43e13c8) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/spinlock.h:116
#4  TCommonLockOps<TAdaptiveLock>::Acquire (t=0x41ee43e13c8) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:9
#5  TGuard<TAdaptiveLock, TCommonLockOps<TAdaptiveLock> >::Init (t=0x41ee43e13c8, this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:81
#6  TGuard<TAdaptiveLock, TCommonLockOps<TAdaptiveLock> >::TGuard (t=0x41ee43e13c8, this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:46
#7  Guard<TAdaptiveLock> (t=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:96
#8  NYdb::NPersQueue::TSingleClusterReadSessionImpl<false>::OnDecompressionInfoDestroy (this=0x41ee43e1020, compressedSize=0, decompressedSize=1686, messagesCount=<optimized out>, serverBytesSize=0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:1507
#9  0x000055c8895f3138 in NYdb::NPersQueue::TDataDecompressionInfo<false>::~TDataDecompressionInfo (this=0x41ee604deb8) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:2247
#10 0x000055c8895dddcd in std::__y1::__shared_count::__release_shared[abi:ue170006]() (this=0x41ee604dea0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/shared_ptr.h:185
#11 std::__y1::__shared_weak_count::__release_shared[abi:ue170006]() (this=0x41ee604dea0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/shared_ptr.h:239
#12 std::__y1::shared_ptr<NYdb::NPersQueue::TDataDecompressionInfo<false> >::~shared_ptr[abi:ue170006]() (this=0x10) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/shared_ptr.h:798
#13 NYdb::NPersQueue::TDataDecompressionEvent<false>::~TDataDecompressionEvent (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:357
#14 std::__y1::__variant_detail::__alt<0ul, NYdb::NPersQueue::TDataDecompressionEvent<false> >::~__alt (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:739
#15 _ZZNSt4__y116__variant_detail6__dtorINS0_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS3_6NTopic17TReadSessionEvent18TDataReceivedEventENS9_33TCommitOffsetAcknowledgementEventENS9_27TStartPartitionSessionEventENS9_26TStopPartitionSessionEventENS9_28TPartitionSessionStatusEventENS9_28TPartitionSessionClosedEventENS8_19TSessionClosedEventEEEEEEELNS0_6_TraitE1EE9__destroyB8ue170006EvENKUlRT_E_clINS0_5__altILm0ES6_EEEEDaSM_ (__alt=..., this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:878
#16 _ZNSt4__y18__invokeB8ue170006IZNS_16__variant_detail6__dtorINS1_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS4_6NTopic17TReadSessionEvent18TDataReceivedEventENSA_33TCommitOffsetAcknowledgementEventENSA_27TStartPartitionSessionEventENSA_26TStopPartitionSessionEventENSA_28TPartitionSessionStatusEventENSA_28TPartitionSessionClosedEventENS9_19TSessionClosedEventEEEEEEELNS1_6_TraitE1EE9__destroyB8ue170006EvEUlRT_E_JRNS1_5__altILm0ES7_EEEEEDTclclsr3stdE7declvalISM_EEspclsr3stdE7declvalIT0_EEEEOSM_DpOSS_ (__args=..., __f=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__type_traits/invoke.h:340
#17 _ZNSt4__y116__variant_detail12__visitation6__base12__dispatcherIJLm0EEE10__dispatchB8ue170006IOZNS0_6__dtorINS0_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS8_6NTopic17TReadSessionEvent18TDataReceivedEventENSE_33TCommitOffsetAcknowledgementEventENSE_27TStartPartitionSessionEventENSE_26TStopPartitionSessionEventENSE_28TPartitionSessionStatusEventENSE_28TPartitionSessionClosedEventENSD_19TSessionClosedEventEEEEEEELNS0_6_TraitE1EE9__destroyB8ue170006EvEUlRT_E_JRNS0_6__baseILSO_1EJSB_SM_EEEEEEDcSQ_DpT0_ (__f=<optimized out>, __vs=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:572
#18 0x000055c8895fc5f2 in _ZNSt4__y116__variant_detail12__visitation6__base11__visit_altB8ue170006IZNS0_6__dtorINS0_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS6_6NTopic17TReadSessionEvent18TDataReceivedEventENSC_33TCommitOffsetAcknowledgementEventENSC_27TStartPartitionSessionEventENSC_26TStopPartitionSessionEventENSC_28TPartitionSessionStatusEventENSC_28TPartitionSessionClosedEventENSB_19TSessionClosedEventEEEEEEELNS0_6_TraitE1EE9__destroyB8ue170006EvEUlRT_E_JRSN_EEEDcOSO_DpOT0_ (__visitor=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4c, DIE 0x1b77c842>, __vs=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:535
#19 std::__y1::__variant_detail::__dtor<std::__y1::__variant_detail::__traits<NYdb::NPersQueue::TDataDecompressionEvent<false>, std::__y1::variant<NYdb::NTopic::TReadSessionEvent::TDataReceivedEvent, NYdb::NTopic::TReadSessionEvent::TCommitOffsetAcknowledgementEvent, NYdb::NTopic::TReadSessionEvent::TStartPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TStopPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionStatusEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionClosedEvent, NYdb::NTopic::TSessionClosedEvent> >, (std::__y1::__variant_detail::_Trait)1>::__destroy[abi:ue170006]() (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:878
#20 std::__y1::__variant_detail::__dtor<std::__y1::__variant_detail::__traits<NYdb::NPersQueue::TDataDecompressionEvent<false>, std::__y1::variant<NYdb::NTopic::TReadSessionEvent::TDataReceivedEvent, NYdb::NTopic::TReadSessionEvent::TCommitOffsetAcknowledgementEvent, NYdb::NTopic::TReadSessionEvent::TStartPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TStopPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionStatusEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionClosedEvent, NYdb::NTopic::TSessionClosedEvent> >, (std::__y1::__variant_detail::_Trait)1>::~__dtor (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:878
#21 std::__y1::variant<NYdb::NPersQueue::TDataDecompressionEvent<false>, std::__y1::variant<NYdb::NTopic::TReadSessionEvent::TDataReceivedEvent, NYdb::NTopic::TReadSessionEvent::TCommitOffsetAcknowledgementEvent, NYdb::NTopic::TReadSessionEvent::TStartPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TStopPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionStatusEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionClosedEvent, NYdb::NTopic::TSessionClosedEvent> >::~variant[abi:ue170006]() (this=0x41ef48da900) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:1399
#22 NYdb::NPersQueue::TRawPartitionStreamEvent<false>::~TRawPartitionStreamEvent (this=0x41ef48da900) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:440
#23 std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> >::destroy[abi:ue170006](NYdb::NPersQueue::TRawPartitionStreamEvent<false>*) (this=0x41ea8f55c78, __p=0x41ef48da900) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/allocator.h:172
#24 std::__y1::allocator_traits<std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> > >::destroy[abi:ue170006]<NYdb::NPersQueue::TRawPartitionStreamEvent<false>, void>(std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> >&, NYdb::NPersQueue::TRawPartitionStreamEvent<false>*) (__a=..., __p=0x41ef48da900) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/allocator_traits.h:315
#25 std::__y1::deque<NYdb::NPersQueue::TRawPartitionStreamEvent<false>, std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> > >::clear (this=0x41ea8f55c50) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/deque:2868
#26 NYdb::NPersQueue::TRawPartitionStreamEventQueue<false>::clear (this=0x41ea8f55c50) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:533
#27 0x000055c8895cb485 in NYdb::NPersQueue::TPartitionStreamImpl<false>::ClearQueue (this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:739
#28 NYdb::NPersQueue::TReadSessionEventsQueue<false>::Close (this=0x41eb5568d38, event=..., deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:806
#29 0x000055c8895ccb1b in NYdb::NTopic::TReadSession::AbortImpl(NYdb::NTopic::TSessionClosedEvent&&, NYdb::NPersQueue::TDeferredActions<false>&) (this=this@entry=0x41e8d5c5420, closeEvent=closeEvent@entry=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4c, DIE 0x1b752f02>, deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:367
#30 0x000055c8895cccd6 in NYdb::NTopic::TReadSession::AbortImpl(NYdb::EStatus, NYql::TIssues&&, NYdb::NPersQueue::TDeferredActions<false>&) (this=this@entry=0x41e8d5c5420, statusCode=statusCode@entry=NYdb::EStatus::ABORTED, issues=issues@entry=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4c, DIE 0x1b75126f>, deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:374
#31 0x000055c8895c67d0 in NYdb::NTopic::TReadSession::AbortImpl (this=this@entry=0x41e8d5c5420, statusCode=statusCode@entry=NYdb::EStatus::ABORTED, message=..., deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:382
#32 0x000055c8895ca0ef in NYdb::NTopic::TReadSession::Close (this=0x41e8d5c5420, timeout=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:274
#33 0x000055c89847382a in NYql::NDq::TDqPqReadActor::PassAway (this=0x41e9e94f080) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/providers/pq/async_io/dq_pq_read_actor.cpp:263
#34 0x000055c89851e9ec in NYql::NDq::TDqComputeActorBase<NYql::NDq::TDqAsyncComputeActor, NYql::NDq::TComputeActorAsyncInputHelperForTaskRunnerActor>::Terminate (this=this@entry=0x41ea9ed0000, success=false, issues=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h:458
#35 0x000055c89851a7e4 in NYql::NDq::TDqComputeActorBase<NYql::NDq::TDqAsyncComputeActor, NYql::NDq::TComputeActorAsyncInputHelperForTaskRunnerActor>::HandleExecuteBase (this=0x41ea9ed0000, ev=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h:1127
#36 0x000055c898520700 in NYql::NDq::TDqComputeActorBase<NYql::NDq::TDqAsyncComputeActor, NYql::NDq::TComputeActorAsyncInputHelperForTaskRunnerActor>::StateFuncWrapper<&NYql::NDq::TDqAsyncComputeActor::StateFuncBody> (this=0x7f58225b3ff0, ev=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h:225
#37 0x000055c88649b5ad in NActors::TGenericExecutorThread::Execute<NActors::TMailboxTable::THTSwapMailbox> (this=this@entry=0x41ebd610800, mailbox=0x41ebc684c40, hint=hint@entry=305, isTailExecution=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:244
#38 0x000055c88649362b in NActors::TGenericExecutorThread::ProcessExecutorPool(NActors::IExecutorPool*)::$_0::operator()(unsigned int, bool) const (this=this@entry=0x7f58225b4e00, activation=activation@entry=305, isTailExecution=false) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:425
#39 0x000055c886492fce in NActors::TGenericExecutorThread::ProcessExecutorPool (this=this@entry=0x41ebd610800, pool=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:478
#40 0x000055c886493e78 in NActors::TExecutorThread::ThreadProc (this=0x41ebd610800) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:504
#41 0x000055c885f72bea in (anonymous namespace)::TPosixThread::ThreadProxy (arg=0x41ebb4f2f40) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/thread.cpp:244
#42 0x00007f58373206ba in start_thread (arg=0x7f58225eb700) at pthread_create.c:333
#43 0x00007f5836e5351d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 40 (Thread 0x7f5822dec700 (LWP 57)):
#0  0x00007f5836e1838d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f5836e49e94 in usleep (useconds=<optimized out>) at ../sysdeps/posix/usleep.c:32
#2  0x000055c885f6e9b4 in TSpinWait::Sleep (this=0x7f5822db5058) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/spin_wait.cpp:36
#3  0x000055c8895f34d8 in TAdaptiveLock::Acquire (this=0x41e8ce94848) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/spinlock.h:116
#4  TCommonLockOps<TAdaptiveLock>::Acquire (t=0x41e8ce94848) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:9
#5  TGuard<TAdaptiveLock, TCommonLockOps<TAdaptiveLock> >::Init (t=0x41e8ce94848, this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:81
#6  TGuard<TAdaptiveLock, TCommonLockOps<TAdaptiveLock> >::TGuard (t=0x41e8ce94848, this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:46
#7  Guard<TAdaptiveLock> (t=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/guard.h:96
#8  NYdb::NPersQueue::TSingleClusterReadSessionImpl<false>::OnDecompressionInfoDestroy (this=0x41e8ce944a0, compressedSize=0, decompressedSize=5966, messagesCount=<optimized out>, serverBytesSize=0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:1507
#9  0x000055c8895f3138 in NYdb::NPersQueue::TDataDecompressionInfo<false>::~TDataDecompressionInfo (this=0x41e94ee5c98) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.ipp:2247
#10 0x000055c8895dddcd in std::__y1::__shared_count::__release_shared[abi:ue170006]() (this=0x41e94ee5c80) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/shared_ptr.h:185
#11 std::__y1::__shared_weak_count::__release_shared[abi:ue170006]() (this=0x41e94ee5c80) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/shared_ptr.h:239
#12 std::__y1::shared_ptr<NYdb::NPersQueue::TDataDecompressionInfo<false> >::~shared_ptr[abi:ue170006]() (this=0x10) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/shared_ptr.h:798
#13 NYdb::NPersQueue::TDataDecompressionEvent<false>::~TDataDecompressionEvent (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:357
#14 std::__y1::__variant_detail::__alt<0ul, NYdb::NPersQueue::TDataDecompressionEvent<false> >::~__alt (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:739
#15 _ZZNSt4__y116__variant_detail6__dtorINS0_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS3_6NTopic17TReadSessionEvent18TDataReceivedEventENS9_33TCommitOffsetAcknowledgementEventENS9_27TStartPartitionSessionEventENS9_26TStopPartitionSessionEventENS9_28TPartitionSessionStatusEventENS9_28TPartitionSessionClosedEventENS8_19TSessionClosedEventEEEEEEELNS0_6_TraitE1EE9__destroyB8ue170006EvENKUlRT_E_clINS0_5__altILm0ES6_EEEEDaSM_ (__alt=..., this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:878
#16 _ZNSt4__y18__invokeB8ue170006IZNS_16__variant_detail6__dtorINS1_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS4_6NTopic17TReadSessionEvent18TDataReceivedEventENSA_33TCommitOffsetAcknowledgementEventENSA_27TStartPartitionSessionEventENSA_26TStopPartitionSessionEventENSA_28TPartitionSessionStatusEventENSA_28TPartitionSessionClosedEventENS9_19TSessionClosedEventEEEEEEELNS1_6_TraitE1EE9__destroyB8ue170006EvEUlRT_E_JRNS1_5__altILm0ES7_EEEEEDTclclsr3stdE7declvalISM_EEspclsr3stdE7declvalIT0_EEEEOSM_DpOSS_ (__args=..., __f=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__type_traits/invoke.h:340
#17 _ZNSt4__y116__variant_detail12__visitation6__base12__dispatcherIJLm0EEE10__dispatchB8ue170006IOZNS0_6__dtorINS0_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS8_6NTopic17TReadSessionEvent18TDataReceivedEventENSE_33TCommitOffsetAcknowledgementEventENSE_27TStartPartitionSessionEventENSE_26TStopPartitionSessionEventENSE_28TPartitionSessionStatusEventENSE_28TPartitionSessionClosedEventENSD_19TSessionClosedEventEEEEEEELNS0_6_TraitE1EE9__destroyB8ue170006EvEUlRT_E_JRNS0_6__baseILSO_1EJSB_SM_EEEEEEDcSQ_DpT0_ (__f=<optimized out>, __vs=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:572
#18 0x000055c8895fc5f2 in _ZNSt4__y116__variant_detail12__visitation6__base11__visit_altB8ue170006IZNS0_6__dtorINS0_8__traitsIJN4NYdb10NPersQueue23TDataDecompressionEventILb0EEENS_7variantIJNS6_6NTopic17TReadSessionEvent18TDataReceivedEventENSC_33TCommitOffsetAcknowledgementEventENSC_27TStartPartitionSessionEventENSC_26TStopPartitionSessionEventENSC_28TPartitionSessionStatusEventENSC_28TPartitionSessionClosedEventENSB_19TSessionClosedEventEEEEEEELNS0_6_TraitE1EE9__destroyB8ue170006EvEUlRT_E_JRSN_EEEDcOSO_DpOT0_ (__visitor=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4c, DIE 0x1b77c842>, __vs=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:535
#19 std::__y1::__variant_detail::__dtor<std::__y1::__variant_detail::__traits<NYdb::NPersQueue::TDataDecompressionEvent<false>, std::__y1::variant<NYdb::NTopic::TReadSessionEvent::TDataReceivedEvent, NYdb::NTopic::TReadSessionEvent::TCommitOffsetAcknowledgementEvent, NYdb::NTopic::TReadSessionEvent::TStartPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TStopPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionStatusEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionClosedEvent, NYdb::NTopic::TSessionClosedEvent> >, (std::__y1::__variant_detail::_Trait)1>::__destroy[abi:ue170006]() (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:878
#20 std::__y1::__variant_detail::__dtor<std::__y1::__variant_detail::__traits<NYdb::NPersQueue::TDataDecompressionEvent<false>, std::__y1::variant<NYdb::NTopic::TReadSessionEvent::TDataReceivedEvent, NYdb::NTopic::TReadSessionEvent::TCommitOffsetAcknowledgementEvent, NYdb::NTopic::TReadSessionEvent::TStartPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TStopPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionStatusEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionClosedEvent, NYdb::NTopic::TSessionClosedEvent> >, (std::__y1::__variant_detail::_Trait)1>::~__dtor (this=0x0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:878
#21 std::__y1::variant<NYdb::NPersQueue::TDataDecompressionEvent<false>, std::__y1::variant<NYdb::NTopic::TReadSessionEvent::TDataReceivedEvent, NYdb::NTopic::TReadSessionEvent::TCommitOffsetAcknowledgementEvent, NYdb::NTopic::TReadSessionEvent::TStartPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TStopPartitionSessionEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionStatusEvent, NYdb::NTopic::TReadSessionEvent::TPartitionSessionClosedEvent, NYdb::NTopic::TSessionClosedEvent> >::~variant[abi:ue170006]() (this=0x41e9eacd980) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/variant:1399
#22 NYdb::NPersQueue::TRawPartitionStreamEvent<false>::~TRawPartitionStreamEvent (this=0x41e9eacd980) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:440
#23 std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> >::destroy[abi:ue170006](NYdb::NPersQueue::TRawPartitionStreamEvent<false>*) (this=0x41eb230bd58, __p=0x41e9eacd980) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/allocator.h:172
#24 std::__y1::allocator_traits<std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> > >::destroy[abi:ue170006]<NYdb::NPersQueue::TRawPartitionStreamEvent<false>, void>(std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> >&, NYdb::NPersQueue::TRawPartitionStreamEvent<false>*) (__a=..., __p=0x41e9eacd980) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/allocator_traits.h:315
#25 std::__y1::deque<NYdb::NPersQueue::TRawPartitionStreamEvent<false>, std::__y1::allocator<NYdb::NPersQueue::TRawPartitionStreamEvent<false> > >::clear (this=0x41eb230bd30) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/deque:2868
#26 NYdb::NPersQueue::TRawPartitionStreamEventQueue<false>::clear (this=0x41eb230bd30) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:533
#27 0x000055c8895cb485 in NYdb::NPersQueue::TPartitionStreamImpl<false>::ClearQueue (this=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:739
#28 NYdb::NPersQueue::TReadSessionEventsQueue<false>::Close (this=0x41eb6ac4378, event=..., deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_persqueue_core/impl/read_session.h:806
#29 0x000055c8895ccb1b in NYdb::NTopic::TReadSession::AbortImpl(NYdb::NTopic::TSessionClosedEvent&&, NYdb::NPersQueue::TDeferredActions<false>&) (this=this@entry=0x41ef53bfc20, closeEvent=closeEvent@entry=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4c, DIE 0x1b752f02>, deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:367
#30 0x000055c8895cccd6 in NYdb::NTopic::TReadSession::AbortImpl(NYdb::EStatus, NYql::TIssues&&, NYdb::NPersQueue::TDeferredActions<false>&) (this=this@entry=0x41ef53bfc20, statusCode=statusCode@entry=NYdb::EStatus::ABORTED, issues=issues@entry=<unknown type in /opt/kikimr/bin/kikimr, CU 0x1b618b4c, DIE 0x1b75126f>, deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:374
#31 0x000055c8895c67d0 in NYdb::NTopic::TReadSession::AbortImpl (this=this@entry=0x41ef53bfc20, statusCode=statusCode@entry=NYdb::EStatus::ABORTED, message=..., deferred=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:382
#32 0x000055c8895ca0ef in NYdb::NTopic::TReadSession::Close (this=0x41ef53bfc20, timeout=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/public/sdk/cpp/client/ydb_topic/impl/read_session.cpp:274
#33 0x000055c89847382a in NYql::NDq::TDqPqReadActor::PassAway (this=0x41e9b08ccc0) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/providers/pq/async_io/dq_pq_read_actor.cpp:263
#34 0x000055c89851e9ec in NYql::NDq::TDqComputeActorBase<NYql::NDq::TDqAsyncComputeActor, NYql::NDq::TComputeActorAsyncInputHelperForTaskRunnerActor>::Terminate (this=this@entry=0x41ea9ede800, success=false, issues=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h:458
#35 0x000055c89851a7e4 in NYql::NDq::TDqComputeActorBase<NYql::NDq::TDqAsyncComputeActor, NYql::NDq::TComputeActorAsyncInputHelperForTaskRunnerActor>::HandleExecuteBase (this=0x41ea9ede800, ev=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h:1127
#36 0x000055c898520700 in NYql::NDq::TDqComputeActorBase<NYql::NDq::TDqAsyncComputeActor, NYql::NDq::TComputeActorAsyncInputHelperForTaskRunnerActor>::StateFuncWrapper<&NYql::NDq::TDqAsyncComputeActor::StateFuncBody> (this=0x7f5822db4ff0, ev=...) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/yql/dq/actors/compute/dq_compute_actor_impl.h:225
#37 0x000055c88649b5ad in NActors::TGenericExecutorThread::Execute<NActors::TMailboxTable::THTSwapMailbox> (this=this@entry=0x41ebd611300, mailbox=0x41ebc684280, hint=hint@entry=266, isTailExecution=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:244
#38 0x000055c88649362b in NActors::TGenericExecutorThread::ProcessExecutorPool(NActors::IExecutorPool*)::$_0::operator()(unsigned int, bool) const (this=this@entry=0x7f5822db5e00, activation=activation@entry=266, isTailExecution=false) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:425
#39 0x000055c886492fce in NActors::TGenericExecutorThread::ProcessExecutorPool (this=this@entry=0x41ebd611300, pool=<optimized out>) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:478
#40 0x000055c886493e78 in NActors::TExecutorThread::ThreadProc (this=0x41ebd611300) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:504
#41 0x000055c885f72bea in (anonymous namespace)::TPosixThread::ThreadProxy (arg=0x41ebb4f2f10) at /opt/buildagent/work/7e1e6bfab21d19b2/__FUSE/mount_path/util/system/thread.cpp:244
#42 0x00007f58373206ba in start_thread (arg=0x7f5822dec700) at pthread_create.c:333
#43 0x00007f5836e5351d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

@dorooleg fixed this problem in https://github.com/ydb-platform/ydb/pull/3115. We are not sure that the problem we found is the same one described in the issue description. We will keep watching to see whether it reproduces.

uzhastik commented 7 months ago

There is the following hypothesis: the pq sdk had a deadlock that caused the entire thread pool to hang, so we do not know whether the pool hangs without that deadlock. We will roll out the deadlock fix and wait for a recurrence.
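
For readers unfamiliar with this class of bug: frames #25 to #28 of the trace show `TReadSessionEventsQueue::Close` calling `ClearQueue`, which destroys the queued events in place. If a lock is held during that call and an event destructor ends up needing the same lock, the thread blocks forever. That this is what happened here is an assumption, and the sketch below is not the change made in the pull request above. `TEvent`, `TEventQueue` and `CloseReentersSafely` are hypothetical names invented for illustration, not SDK types. The sketch only shows the general pattern of detaching the queue under the lock and destroying its contents after the lock is released.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical event: its destructor runs a callback, standing in for
// an event whose destruction calls back into the owning session.
struct TEvent {
    std::function<void()> OnDestroy;

    explicit TEvent(std::function<void()> cb) : OnDestroy(std::move(cb)) {}
    TEvent(TEvent&& other) noexcept : OnDestroy(std::move(other.OnDestroy)) {
        other.OnDestroy = nullptr; // moved-from object must not fire the callback
    }
    TEvent(const TEvent&) = delete;
    TEvent& operator=(const TEvent&) = delete;
    TEvent& operator=(TEvent&&) = delete;

    ~TEvent() {
        if (OnDestroy) {
            OnDestroy();
        }
    }
};

// Hypothetical queue guarded by a non-recursive mutex.
class TEventQueue {
public:
    void Push(std::function<void()> cb) {
        std::lock_guard<std::mutex> guard(Lock_);
        Events_.emplace_back(std::move(cb));
    }

    std::size_t Size() const {
        std::lock_guard<std::mutex> guard(Lock_);
        return Events_.size();
    }

    // Detach the events while holding the lock, destroy them after releasing it.
    // Calling Events_.clear() inside the locked scope instead would run the
    // destructors under Lock_, and a destructor that re-enters the queue
    // would then block on a mutex its own thread already holds.
    void Close() {
        std::deque<TEvent> doomed;
        {
            std::lock_guard<std::mutex> guard(Lock_);
            doomed.swap(Events_);
        }
        // `doomed` goes out of scope here, with Lock_ no longer held.
    }

private:
    mutable std::mutex Lock_;
    std::deque<TEvent> Events_;
};

// Returns true if an event destructor could re-enter the queue during Close().
bool CloseReentersSafely() {
    std::size_t seen = 99;
    TEventQueue queue;
    queue.Push([&] { seen = queue.Size(); }); // takes Lock_ from the destructor
    queue.Close();
    return seen == 0; // the queue was already empty when the destructor ran
}
```

The swap keeps the critical section short and ensures no user-visible code runs while the mutex is held. The cost is that events are destroyed slightly later than the moment the queue becomes empty.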