Does this PR entail a change in behavior?
[ ] Yes, this PR will result in a change in behavior.
[x] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
[ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
[ ] Parameter changes: default values, similar parameters but with different default values
[ ] Policy changes: use new policy to replace old one, functionality automatically enabled
[ ] Feature removed
[ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
[x] I have added test cases for my bug fix or my new feature
[ ] This PR needs user documentation (for new or modified features or behaviors)
[ ] I have added documentation for my new feature or new function
[x] This is a backport PR
Bugfix cherry-pick branch check:
[x] I have checked the version labels for the target branches to which the PR will be auto-backported
[x] 3.3
[x] 3.2
[ ] 3.1
[ ] 3.0
[ ] 2.5
This is an automatic backport of pull request #52968 done by Mergify.
Why I'm doing:
In a previous PR, we introduced MetadataCache to apply an LRU strategy to rowset metadata memory management. When a rowset is evicted from the cache, the cache calls _cache_value_deleter to close the rowset and release its metadata memory.
In some cases, however, the rowset that the cached pointer refers to may already have been destroyed. _cache_value_deleter has no way of knowing this and still calls close(), which leads to a concurrency issue.
What I'm doing:
This pull request changes the MetadataCache class and related functionality in the rowset module. The main goals are to improve memory management by using std::weak_ptr for cached rowsets and to rename the warmup_rowset method to refresh_rowset for clarity. A new concurrency test is also added to verify thread safety.
Memory management improvements:
be/src/storage/rowset/metadata_cache.cpp: Changed MetadataCache::cache_rowset to use std::weak_ptr<Rowset> for better memory management and updated the _insert method accordingly. [1][2]
be/src/storage/rowset/metadata_cache.cpp: Modified _cache_value_deleter to handle std::weak_ptr<Rowset> and ensure proper cleanup of cached rowsets.
Method renaming for clarity:
be/src/storage/rowset/metadata_cache.cpp and be/src/storage/rowset/metadata_cache.h: Renamed warmup_rowset to refresh_rowset to better reflect its purpose. [1][2][3]
be/src/storage/rowset/rowset.cpp: Updated calls to warmup_rowset to use the new refresh_rowset method.
be/test/storage/rowset/metadata_cache_test.cpp: Updated test cases to use refresh_rowset instead of warmup_rowset. [1][2]
New test for concurrency:
be/test/storage/rowset/metadata_cache_test.cpp: Added a new test, test_concurrency_issue, to verify the thread safety of MetadataCache when caching rowsets concurrently.
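The idea behind the fix can be illustrated with a minimal, self-contained sketch. This is not the StarRocks implementation: the Rowset stub, TinyMetadataCache, skipped_evictions, and run_demo are hypothetical names invented for illustration, and the real cache's eviction logic and locking are more involved. The sketch only shows why storing a std::weak_ptr lets the eviction path detect a rowset that has already been destroyed instead of calling close() on a dangling pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the real Rowset class; only close() matters here.
class Rowset {
public:
    explicit Rowset(std::string id) : _id(std::move(id)) {}
    const std::string& id() const { return _id; }
    void close() { _closed = true; }
    bool closed() const { return _closed; }

private:
    std::string _id;
    bool _closed = false;
};

// Hypothetical, simplified LRU cache that does not own the rowsets it tracks.
class TinyMetadataCache {
public:
    explicit TinyMetadataCache(std::size_t capacity) : _capacity(capacity) {}

    // Insert or refresh a rowset, then evict least recently used entries.
    void cache_rowset(const std::shared_ptr<Rowset>& rowset) {
        std::lock_guard<std::mutex> guard(_mutex);
        auto found = _index.find(rowset->id());
        if (found != _index.end()) {
            _lru.erase(found->second);
            _index.erase(found);
        }
        _lru.emplace_front(rowset->id(), std::weak_ptr<Rowset>(rowset));
        _index[rowset->id()] = _lru.begin();
        while (_lru.size() > _capacity) {
            _on_evict(_lru.back().second);
            _index.erase(_lru.back().first);
            _lru.pop_back();
        }
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(_mutex);
        return _lru.size();
    }

    // Number of evictions where the rowset was already gone.
    std::size_t skipped_evictions() const {
        std::lock_guard<std::mutex> guard(_mutex);
        return _skipped;
    }

private:
    using Entry = std::pair<std::string, std::weak_ptr<Rowset>>;

    // Plays the role of the deleter: close only if the rowset is still alive.
    void _on_evict(const std::weak_ptr<Rowset>& weak) {
        if (std::shared_ptr<Rowset> rowset = weak.lock()) {
            rowset->close();  // safe: lock() keeps the rowset alive for this call
        } else {
            ++_skipped;       // rowset already destroyed, nothing to close
        }
    }

    const std::size_t _capacity;
    mutable std::mutex _mutex;
    std::list<Entry> _lru;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> _index;
    std::size_t _skipped = 0;
};

struct DemoResult {
    bool live_rowset_closed;
    std::size_t skipped;
    std::size_t cache_size;
};

// Exercises both eviction paths with a cache that holds a single entry.
DemoResult run_demo() {
    TinyMetadataCache cache(1);
    auto a = std::make_shared<Rowset>("a");
    auto b = std::make_shared<Rowset>("b");
    cache.cache_rowset(a);
    cache.cache_rowset(b);  // evicts "a", which is still alive, so it is closed
    bool live_rowset_closed = a->closed();
    b.reset();              // "b" is destroyed while its entry is still cached
    auto c = std::make_shared<Rowset>("c");
    cache.cache_rowset(c);  // evicts "b": lock() fails, so close() is skipped
    return {live_rowset_closed, cache.skipped_evictions(), cache.size()};
}
```

In this sketch, weak_ptr::lock() either returns a shared_ptr that keeps the rowset alive for the duration of close() or returns an empty pointer, so the eviction path never touches freed memory. A raw pointer stored in the cache gives the deleter no such signal, which matches the failure described above.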