pingcap / tiflash

The analytical engine for TiDB and TiDB Cloud. Try free: https://tidbcloud.com/free-trial
https://docs.pingcap.com/tidb/stable/tiflash-overview
Apache License 2.0

"DB::Exception: Can not find path for DMFile" when doing manual compaction #4808

Open breezewish opened 2 years ago

breezewish commented 2 years ago

Bug Report


1. Minimal reproduce step (Required)

Enable PageStorage V3, then run:

create table x(pk int, b int) engine = DeltaMerge(pk);
create table y(pk int, b int) engine = DeltaMerge(pk);
create table z(pk int, b int) engine = DeltaMerge(pk);
manage table x merge delta;
manage table y merge delta;  <--- Error raised

2. What did you expect to see? (Required)

No errors

3. What did you see instead (Required)

Code: 0. DB::Exception: Received from 127.0.0.1:9000. DB::Exception: Can not find path for DMFile [id=2].

The server log contains the following error:

2022.04.29 16:45:30.022599 [ 7 ] <Error> void DB::AsynchronousMetrics::run(): Code: 0, e.displayText() = DB::Exception: Can not find path for DMFile [id=2], e.what() = DB::Exception, Stack trace:

     0x104af2038    StackTrace::StackTrace() [tiflash+4296024120]
                    dbms/src/Common/StackTrace.cpp:23
     0x104af2074    StackTrace::StackTrace() [tiflash+4296024180]
                    dbms/src/Common/StackTrace.cpp:22
     0x104d068e0    DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [tiflash+4298205408]
                    dbms/src/Common/Exception.h:41
     0x104cf63c8    DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [tiflash+4298138568]
                    dbms/src/Common/Exception.h:43
     0x10d8cf740    DB::StableDiskDelegator::getDTFilePath(unsigned long long, bool) const [tiflash+4444780352]
                    dbms/src/Storages/PathPool.cpp:403
     0x10dc6f0f4    DB::DM::StableValueSpace::restore(DB::DM::DMContext&, unsigned long long) [tiflash+4448579828]
                    dbms/src/Storages/DeltaMerge/StableValueSpace.cpp:109
     0x10dbf61c8    DB::DM::Segment::restoreSegment(DB::DM::DMContext&, unsigned long long) [tiflash+4448084424]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:275
     0x10db42e98    DB::DM::DeltaMergeStore::DeltaMergeStore(DB::Context&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, long long, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > const&, DB::DM::ColumnDefine const&, bool, unsigned long, DB::DM::DeltaMergeStore::Settings const&) [tiflash+4447350424]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:267
     0x10db44eb4    DB::DM::DeltaMergeStore::DeltaMergeStore(DB::Context&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, long long, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> > const&, DB::DM::ColumnDefine const&, bool, unsigned long, DB::DM::DeltaMergeStore::Settings const&) [tiflash+4447358644]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:212
     0x10d9af7c4    std::__1::__shared_ptr_emplace<DB::DM::DeltaMergeStore, std::__1::allocator<DB::DM::DeltaMergeStore> >::__shared_ptr_emplace<DB::Context&, bool const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long long&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> >, DB::DM::ColumnDefine, bool&, unsigned long&, DB::DM::DeltaMergeStore::Settings>(std::__1::allocator<DB::DM::DeltaMergeStore>, DB::Context&, bool const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long long&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> >&&, DB::DM::ColumnDefine&&, bool&, unsigned long&, DB::DM::DeltaMergeStore::Settings&&) [tiflash+4445697988]
                    /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/c++/v1/memory:2627
     0x10d9af648    std::__1::__shared_ptr_emplace<DB::DM::DeltaMergeStore, std::__1::allocator<DB::DM::DeltaMergeStore> >::__shared_ptr_emplace<DB::Context&, bool const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long long&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> >, DB::DM::ColumnDefine, bool&, unsigned long&, DB::DM::DeltaMergeStore::Settings>(std::__1::allocator<DB::DM::DeltaMergeStore>, DB::Context&, bool const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long long&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> >&&, DB::DM::ColumnDefine&&, bool&, unsigned long&, DB::DM::DeltaMergeStore::Settings&&) [tiflash+4445697608]
                    /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/c++/v1/memory:2621
     0x10d9af544    std::__1::shared_ptr<DB::DM::DeltaMergeStore> std::__1::allocate_shared<DB::DM::DeltaMergeStore, std::__1::allocator<DB::DM::DeltaMergeStore>, DB::Context&, bool const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long long&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> >, DB::DM::ColumnDefine, bool&, unsigned long&, DB::DM::DeltaMergeStore::Settings, void>(std::__1::allocator<DB::DM::DeltaMergeStore> const&, DB::Context&, bool const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long long&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> >&&, DB::DM::ColumnDefine&&, bool&, unsigned long&, DB::DM::DeltaMergeStore::Settings&&) [tiflash+4445697348]
                    /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/c++/v1/memory:3385
     0x10d96cb6c    std::__1::shared_ptr<DB::DM::DeltaMergeStore> std::__1::make_shared<DB::DM::DeltaMergeStore, DB::Context&, bool const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long long&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> >, DB::DM::ColumnDefine, bool&, unsigned long&, DB::DM::DeltaMergeStore::Settings, void>(DB::Context&, bool const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long long&, std::__1::vector<DB::DM::ColumnDefine, std::__1::allocator<DB::DM::ColumnDefine> >&&, DB::DM::ColumnDefine&&, bool&, unsigned long&, DB::DM::DeltaMergeStore::Settings&&) [tiflash+4445424492]
                    /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/c++/v1/memory:3394
     0x10d95e62c    DB::StorageDeltaMerge::getAndMaybeInitStore() [tiflash+4445365804]
                    dbms/src/Storages/StorageDeltaMerge.cpp:1528
     0x10d1eacbc    DB::StorageDeltaMerge::getStore() [tiflash+4437552316]
                    dbms/src/Storages/StorageDeltaMerge.h:137
     0x10d1ea590    DB::AsynchronousMetrics::update() [tiflash+4437550480]
                    dbms/src/Interpreters/AsynchronousMetrics.cpp:165
     0x10d1e9ff4    DB::AsynchronousMetrics::run() [tiflash+4437549044]
                    dbms/src/Interpreters/AsynchronousMetrics.cpp:103
     0x104c31278    DB::AsynchronousMetrics::AsynchronousMetrics(DB::Context&)::'lambda'()::operator()() const [tiflash+4297331320]
                    dbms/src/Interpreters/AsynchronousMetrics.h:38
     0x104c31214    decltype(std::__1::forward<DB::AsynchronousMetrics::AsynchronousMetrics(DB::Context&)::'lambda'()>(fp)()) std::__1::__invoke<DB::AsynchronousMetrics::AsynchronousMetrics(DB::Context&)::'lambda'()>(DB::AsynchronousMetrics::AsynchronousMetrics(DB::Context&)::'lambda'()&&) [tiflash+4297331220]
                    /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/c++/v1/type_traits:3694
     0x104c311ac    void std::__1::__thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, DB::AsynchronousMetrics::AsynchronousMetrics(DB::Context&)::'lambda'()>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, DB::AsynchronousMetrics::AsynchronousMetrics(DB::Context&)::'lambda'()>&, std::__1::__tuple_indices<>) [tiflash+4297331116]
                    /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/c++/v1/thread:286
     0x104c309ac    void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, DB::AsynchronousMetrics::AsynchronousMetrics(DB::Context&)::'lambda'()> >(void*) [tiflash+4297329068]
                    /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/c++/v1/thread:297
     0x1c7691240    __pthread_deallocate [libsystem_pthread.dylib+6445650496]
     0x1c768c024    _pthread_key_init_np [libsystem_pthread.dylib+6445629476]

4. What is your TiFlash version? (Required)

abb313953973dd955daad63724a2cf22bb94e264

(April 22, 2022 at 11:58:04 GMT+8)

jiaqizho commented 2 years ago

When PageStorage V3 is enabled, we use the namespace id to distinguish different tables.

create table x(pk int, b int) engine = DeltaMerge(pk);
create table y(pk int, b int) engine = DeltaMerge(pk);
create table z(pk int, b int) engine = DeltaMerge(pk);

After tables x, y, and z are created, they all get the same namespace id (TEST_NAMESPACE_ID, 1000).

manage table x merge delta;

After the delta of table x is merged, namespace id 1000 contains 2 DTFiles. When table y is then accessed, it restores those DTFiles and cannot find the DTFile whose id is 2.

So this is a problem in the TiFlash client: we should not use TEST_NAMESPACE_ID by default.
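
The collision described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not TiFlash code: `SharedMetaStore`, `ToyTable`, and `reproduce` are hypothetical names, the model assumes that DTFile ids are recorded per namespace id in a store shared by all tables while the files themselves live in each table's own directory, and the file id 2 is chosen only to mirror the error message in the report.

```python
from dataclasses import dataclass, field

TEST_NAMESPACE_ID = 1000  # value quoted in the comment above


class SharedMetaStore:
    """Toy stand-in for a page storage shared by every table, keyed by namespace id."""

    def __init__(self):
        self.dtfile_ids = {}  # namespace id -> list of DTFile ids recorded there

    def record(self, ns_id, file_id):
        self.dtfile_ids.setdefault(ns_id, []).append(file_id)

    def files_in(self, ns_id):
        return list(self.dtfile_ids.get(ns_id, []))


@dataclass
class ToyTable:
    name: str
    ns_id: int
    # DTFile ids that exist in this table's own directory (assumption of the model).
    files_on_disk: set = field(default_factory=set)

    def merge_delta(self, meta, file_id):
        """Write a DTFile for this table and record its id in the shared store."""
        self.files_on_disk.add(file_id)
        meta.record(self.ns_id, file_id)

    def restore(self, meta):
        """Resolve every DTFile id recorded under this table's namespace id."""
        for file_id in meta.files_in(self.ns_id):
            if file_id not in self.files_on_disk:
                raise LookupError(f"Can not find path for DMFile [id={file_id}]")


def reproduce(ns_ids):
    """Create x and y with the given namespace ids, merge x, then restore y."""
    meta = SharedMetaStore()
    x = ToyTable("x", ns_ids["x"])
    y = ToyTable("y", ns_ids["y"])
    x.merge_delta(meta, 2)  # hypothetical file id, mirrors the reported message
    try:
        y.restore(meta)
    except LookupError as exc:
        return str(exc)
    return "ok"


if __name__ == "__main__":
    # Both tables share one namespace id: y sees x's DTFile id and fails.
    print(reproduce({"x": TEST_NAMESPACE_ID, "y": TEST_NAMESPACE_ID}))
    # Distinct (hypothetical) namespace ids: y sees nothing that belongs to x.
    print(reproduce({"x": 1001, "y": 1002}))
```

In this model, giving each table its own namespace id is enough to make the restore of y succeed, which matches the suggestion not to use TEST_NAMESPACE_ID by default.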

JaySon-Huang commented 2 years ago

@breezewish You're creating tables through ch-client, a legacy method that works the same way as using the ClickHouse client to create a table on a ClickHouse server. We keep ch-client only as a debugging tool and do not officially support it.

A similar problem came up in the CI tests, so we allocated different IDs for tables created through ch-client in this PR: https://github.com/pingcap/tiflash/pull/4831. But there are still some known limitations:

BTW, for debugging purposes, we normally use DBGInvoke mock_tidb_table(...) to create tables for mock tests like this: https://github.com/pingcap/tiflash/blob/4ad156b924c88233bb7d9da43f2f47d249a16cdb/tests/delta-merge-test/raft/bugs/FLASH-484.test#L24-L32

breezewish commented 2 years ago

@JaySon-Huang Thanks for the additional information! It is really helpful.

I also discovered that some legacy tests still create tables with CREATE TABLE, e.g. https://github.com/pingcap/tiflash/blob/4ad156b924c88233bb7d9da43f2f47d249a16cdb/tests/delta-merge-test/ddl/alter.test#L18-L22

Maybe we need to migrate these tests, or make CREATE TABLE work without problems?

JaySon-Huang commented 2 years ago

Yes, I think we can rewrite those legacy tests as fullstack-test / unit test cases https://github.com/pingcap/tiflash/pull/4831/commits/27010a6ec14f263a0f9a2c4755adedde31054bdd