StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0

[Crash] BE/CN crash due to obs parallel filesystem with duplicated result in listObjects #48451

Closed gohalo closed 1 month ago

gohalo commented 1 month ago

3.1.13 share-data mode

F0715 03:28:20.325001 1906116 vacuum.cpp:555] Duplicate files were returned from the remote storage. The most likely cause is an S3 or HDFS API compatibility issue with your remote storage implementation. duplicate file: staros://1969195/meta/00000000001E0C2B_000000000000118D.meta
*** Check failure stack trace: ***
3.1.13 RELEASE (build d9d3ed7)
query_id:00000000-0000-0000-0000-000000000000, fragment_instance:00000000-0000-0000-0000-000000000000
tracker:process consumption: 27205388304
tracker:query_pool consumption: 0
tracker:load consumption: 0
tracker:metadata consumption: 552161137
tracker:tablet_metadata consumption: 148878
tracker:rowset_metadata consumption: 0
tracker:segment_metadata consumption: 72103923
tracker:column_metadata consumption: 479908336
tracker:tablet_schema consumption: 148878
tracker:segment_zonemap consumption: 56819505
tracker:short_key_index consumption: 1107737
tracker:column_zonemap_index consumption: 87518368
tracker:ordinal_index consumption: 179983192
tracker:bitmap_index consumption: 0
tracker:bloom_filter_index consumption: 0
tracker:compaction consumption: 0
tracker:schema_change consumption: 0
tracker:column_pool consumption: 0
tracker:page_cache consumption: 23726371904
tracker:update consumption: 0
tracker:chunk_allocator consumption: 61488
tracker:clone consumption: 0
tracker:consistency consumption: 0
tracker:datacache consumption: 0
tracker:replication consumption: 0
*** Aborted at 1720985300 (unix time) try "date -d @1720985300" if you are using GNU date ***
PC: @     0x7f82783e124f (unknown)
*** SIGABRT (@0x7d000220f0e) received by PID 2232078 (TID 0x7f7e66dfe640) from PID 2232078; stack trace: ***
    @          0x653d562 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f82783951f0 (unknown)
    @     0x7f82783e124f (unknown)
    @     0x7f8278395146 raise
    @     0x7f82783804f7 abort
    @          0x2a6487e starrocks::failure_function()
    @          0x6530f3d google::LogMessage::Fail()
    @          0x65333af google::LogMessage::SendToLog()
    @          0x6530a8e google::LogMessage::Flush()
    @          0x65339b9 google::LogMessageFatal::~LogMessageFatal()
    @          0x50a3439 _ZNSt17_Function_handlerIFbSt17basic_string_viewIcSt11char_traitsIcEEEZN9starrocks4lake19delete_tablets_implEPNS6_13TabletManagerERKNSt7__cxx1112basic_stringIcS2_SaIcEEERKSt6vectorIlSaIlEEEUlS3_E0_E9_M_invokeERKSt9_Any_dataOS3_
    @          0x4cf5276 _ZNSt17_Function_handlerIFbN6staros7starlet5fslib9EntryStatEEZN9starrocks17StarletFileSystem11iterate_dirERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt8functionIFbSt17basic_string_viewIcSA_EEEEUlS3_E_E9_M_invokeERKSt9_Any_dataOS3_
    @          0x5cec1c5 _ZN6staros7starlet5fslib12S3FileSystem8list_dirESt17basic_string_viewIcSt11char_traitsIcEES6_S6_bSt10shared_ptrIN3Aws2S38S3ClientEESt8functionIFbNS1_9EntryStatEEE.localalias
    @          0x5cec713 staros::starlet::fslib::S3FileSystem::list_dir()
    @          0x5cfdfcf staros::starlet::fslib::CacheFileSystemImpl::list_dir()
    @          0x5cfc765 staros::starlet::fslib::CacheFileSystem::list_dir()
    @          0x4cf9ea2 starrocks::StarletFileSystem::iterate_dir()
    @          0x509fbbf starrocks::lake::delete_tablets_impl()
    @          0x50a1227 starrocks::lake::delete_tablets()
    @          0x2c653ce _ZNSt17_Function_handlerIFvvEZN9starrocks15LakeServiceImpl13delete_tabletEPN6google8protobuf13RpcControllerEPKNS1_4lake19DeleteTabletRequestEPNS7_20DeleteTabletResponseEPNS4_7ClosureEEUlvE_E9_M_invokeERKSt9_Any_data
    @          0x2d0d64c starrocks::ThreadPool::dispatch_thread()
    @          0x2d072fa starrocks::Thread::supervise_thread()
    @     0x7f82783df67a (unknown)
    @     0x7f82784625e0 (unknown)
    @                0x0 (unknown)
kevincai commented 1 month ago

Duplicate files were returned from the remote storage. The most likely cause is an S3 or HDFS API compatibility issue with your remote storage implementation. duplicate file:

Caused by the Huawei OBS parallel file system's incompatible implementation of the S3 API.

kevincai commented 1 month ago

ROOT CAUSE:

When the server side processes a v2 listing request as a v1 listing request and the result needs pagination, the pagination marker differs between v1 and v2, so the client side fails to interpret the marker, restarts the listing from the beginning, and never terminates. To break this endless loop, a duplication check was added that detects the problem and crashes the BE/CN loudly, because manual intervention is required in such a case.
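
A minimal sketch of the idea behind that duplication check, assuming a simple seen-set guard; the real check lives in vacuum.cpp / lake::delete_tablets_impl per the stack trace above, and the DuplicateGuard name here is hypothetical:

// Illustrative only: remember every name the remote listing has produced and
// abort loudly on the first repeat, which is the symptom of the v1/v2 mix-up.
#include <cstdlib>
#include <iostream>
#include <string>
#include <string_view>
#include <unordered_set>
#include <vector>

class DuplicateGuard {
public:
    // Callback invoked once per listed entry; returns true to keep iterating.
    bool on_entry(std::string_view name) {
        if (!seen_.insert(std::string(name)).second) {
            std::cerr << "Duplicate files were returned from the remote storage."
                      << " duplicate file: " << name << std::endl;
            std::abort();  // crash the BE/CN loudly; manual intervention required
        }
        return true;
    }
private:
    std::unordered_set<std::string> seen_;
};

int main() {
    // Simulate a listing that wraps around to the beginning instead of ending.
    std::vector<std::string> listing = {"a.meta", "b.meta", "a.meta"};
    DuplicateGuard guard;
    for (const auto& name : listing) guard.on_entry(name);
}

Once a listing wraps around to a name it has already produced, continuing would only repeat the same pages forever, so failing fast and asking for manual intervention is the intended behavior.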

TEMP REMEDIATION

  1. Find the failing path from the log
  2. Forcibly drop the entire table or partition
  3. Manually clean the data at that path in the OBS console
  4. Start the BE/CN again
gohalo commented 1 month ago

ROOT CAUSE:

  • The AWS CPP SDK is used to access all S3-compatible object storage; it uses the listObjectsV2 API by default to list objects on the remote service.
  • The OBS parallel file system implements the listObjectsV1 API, not listObjectsV2, and does not report an error when the client issues a listObjectsV2 request to a listObjectsV1 endpoint.

When the server side processes a v2 listing request as a v1 listing request and the result needs pagination, the pagination marker differs between v1 and v2, so the client side fails to interpret the marker, restarts the listing from the beginning, and never terminates. To break this endless loop, a duplication check was added that detects the problem and crashes the BE/CN loudly, because manual intervention is required in such a case.
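
To make the pagination mismatch concrete, here is a sketch of the standard ListObjectsV2 paging loop with the AWS C++ SDK; this is not StarRocks' actual staros::starlet::fslib::S3FileSystem code, and the bucket and prefix values are placeholders:

// Sketch of the ListObjectsV2 paging contract: the next page is requested via
// the continuation token carried in the previous response.
#include <aws/core/Aws.h>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/ListObjectsV2Request.h>
#include <iostream>

int main() {
    Aws::SDKOptions options;
    Aws::InitAPI(options);
    {
        Aws::S3::S3Client client;  // endpoint/credentials from the default config
        Aws::S3::Model::ListObjectsV2Request req;
        req.WithBucket("my-bucket").WithPrefix("some/prefix/");  // placeholders

        bool more = true;
        while (more) {
            auto outcome = client.ListObjectsV2(req);
            if (!outcome.IsSuccess()) break;
            const auto& result = outcome.GetResult();
            for (const auto& obj : result.GetContents()) {
                std::cout << obj.GetKey() << "\n";
            }
            more = result.GetIsTruncated();
            // v2 contract: advance using NextContinuationToken. If the server
            // never sets it, the next request re-lists the first page again.
            req.SetContinuationToken(result.GetNextContinuationToken());
        }
    }
    Aws::ShutdownAPI(options);
}

A v1-only endpoint replies with the v1 paging field (NextMarker) instead of NextContinuationToken, so the token stays empty while IsTruncated remains true, and the loop keeps re-requesting the first page, which is the endless listing described above.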

TEMP REMEDIATION

  1. Find the failing path from the log
  2. Forcibly drop the entire table or partition
  3. Manually clean the data at that path in the OBS console
  4. Start the BE/CN again

So that means the number of files under some OBS dirs should not be larger than the pagination size, maybe 1000?

kevincai commented 1 month ago

yes

kevincai commented 1 month ago

partially fixed in https://github.com/StarRocks/starrocks/pull/48949

kevincai commented 1 month ago

related fix for branch-3.1 in https://github.com/StarRocks/starrocks/pull/49103