Project: TileGroup Compaction, TileGroup Freeing, and Garbage Collection Fixes
Our project has 3 main aspects:
Enabling the Garbage Collector to free empty tile groups and reclaim their memory (Done)
Merging sparsely occupied tile groups to save memory (Done)
Fixing several important bugs in the Garbage Collector and Transaction Manager (Done)
Status
Completed the 75%, 100%, and 125% goals.
Summary
1) TileGroup Freeing
This PR enhances the Garbage Collector to free empty TileGroups once all of their tuple slots have been recycled. TileGroups can no longer be assumed to live forever, so every caller that fetches a tile group from the catalog must check for a nullptr. We scanned the codebase for all call sites (outside of codegen) that skip this check and fixed them. We still need to follow up with Prashanth to do a similar check in the codegen engine.
This PR also adds a new class, RecycleStack, which replaces the Garbage Collector's recycle queues. This concurrent data structure offers a non-blocking, constant-time TryPop() to retrieve a recycled tuple slot, a blocking, constant-time Push(), and a concurrent RemoveAllWithKey() function that uses hand-over-hand locking. The performance of TryPop() matters because it lies on the critical path of DataTable::Insert(). We wrote a new data structure instead of reusing the existing concurrent queue because we need to iterate through the recycled slots and remove those that belong to TileGroups that become immutable. None of the existing concurrent structures support concurrent iteration, insertion, and removal.
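To illustrate the locking scheme described above, here is a minimal sketch of a stack with a non-blocking TryPop(), a blocking Push(), and a RemoveAllWithKey() that uses hand-over-hand locking. It is not the code from this PR: the names RecycleStackSketch and SlotPointer, the use of std::mutex, and keying on a block id are assumptions made for illustration only.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a recycled tuple slot: (tile group id, offset).
struct SlotPointer {
  uint32_t block;
  uint32_t offset;
};

// Illustrative sketch only; names and layout are assumptions, not the PR's code.
class RecycleStackSketch {
 public:
  ~RecycleStackSketch() {
    // Assumes no other thread is using the stack during destruction.
    Node *curr = head_.next;
    while (curr != nullptr) {
      Node *next = curr->next;
      delete curr;
      curr = next;
    }
  }

  // Blocking, constant time: waits for the head lock, then links a new node.
  void Push(const SlotPointer &slot) {
    Node *node = new Node(slot);
    std::lock_guard<std::mutex> guard(head_.lock);
    node->next = head_.next;
    head_.next = node;
  }

  // Non-blocking, constant time: gives up instead of waiting on any lock.
  bool TryPop(SlotPointer &out) {
    std::unique_lock<std::mutex> head_guard(head_.lock, std::try_to_lock);
    if (!head_guard.owns_lock()) return false;
    Node *node = head_.next;
    if (node == nullptr) return false;
    // A concurrent remover may hold the first node; do not wait for it.
    if (!node->lock.try_lock()) return false;
    out = node->slot;
    head_.next = node->next;
    node->lock.unlock();
    delete node;  // safe: reaching this node requires the head lock we hold
    return true;
  }

  // Hand-over-hand traversal: hold the predecessor's lock while locking the
  // current node, so a node is only unlinked while both locks are held.
  uint32_t RemoveAllWithKey(uint32_t block) {
    uint32_t removed = 0;
    Node *prev = &head_;
    prev->lock.lock();
    Node *curr = prev->next;
    while (curr != nullptr) {
      curr->lock.lock();
      if (curr->slot.block == block) {
        prev->next = curr->next;
        curr->lock.unlock();
        delete curr;
        ++removed;
        curr = prev->next;
      } else {
        prev->lock.unlock();  // release the older lock, keep the newer one
        prev = curr;
        curr = curr->next;
      }
    }
    prev->lock.unlock();
    return removed;
  }

 private:
  struct Node {
    Node() : slot{0, 0}, next(nullptr) {}
    explicit Node(const SlotPointer &s) : slot(s), next(nullptr) {}
    SlotPointer slot;
    Node *next;
    std::mutex lock;
  };

  Node head_;  // sentinel; its lock guards the top of the stack
};
```

In this sketch TryPop() can return false even when the stack is not empty, because it refuses to wait on a contended lock. A caller on the insert path would then fall back to allocating a fresh slot rather than block.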
2) TileGroup Compaction
This PR adds a class called TileGroupCompactor that performs compaction of tile groups. Compaction is triggered by the GarbageCollector, which submits a CompactTileGroup() task to the MonoQueuePool when the majority of a TileGroup is recycled garbage. The exact fraction of garbage required to trigger compaction is determined by a global setting, settings::SettingId::compaction_threshold. Compaction can also be enabled or disabled via another global setting, settings::SettingId::compaction_enabled.
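The trigger condition can be sketched as a small predicate. This is only an illustration under stated assumptions: the names CompactionSettings and ShouldCompact are hypothetical, and whether the real comparison against the threshold is inclusive is not stated in this description.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical snapshot of the two global settings named above.
struct CompactionSettings {
  bool enabled;      // mirrors compaction_enabled
  double threshold;  // mirrors compaction_threshold, as a fraction in [0, 1]
};

// Returns true when the recycled fraction of a tile group reaches the
// threshold. The inclusive comparison (>=) is an assumption.
bool ShouldCompact(std::size_t recycled_slots, std::size_t total_slots,
                   const CompactionSettings &settings) {
  assert(settings.threshold >= 0.0 && settings.threshold <= 1.0);
  if (!settings.enabled || total_slots == 0) return false;
  double ratio = static_cast<double>(recycled_slots) /
                 static_cast<double>(total_slots);
  return ratio >= settings.threshold;
}
```

With a threshold of 0.5, for example, a tile group with 6 of 10 slots recycled would qualify, while one with 4 of 10 would not.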
3) Garbage Collection Fixes
This PR resolves almost all of the issues identified in issue #1325.
It is the result of a thorough correctness audit we performed on Peloton's garbage collection system. It adds a new test suite for the Transaction-Level Garbage Collector and several important bug fixes to the Garbage Collector and Transaction Manager.
GC Fixes Summary:
Added 14 tests to transaction_level_gc_manager_test.cpp to handle more complex GC scenarios.
Currently 4 of these tests still fail because old keys pollute the indexes; we believe resolving this will require more significant changes at the execution layer.
We believe we have resolved all of the tuple-level GC bugs and most of the index bugs.
We have disabled the 4 checks that fail and will open a new issue describing only those scenarios.
GCManager::RecycleTupleSlot allows unused ItemPointers to be returned without going through the entire Unlink and Reclaim process.
Modified TOTransactionManager to pass tombstones created by deletes to the GCManager.
Modified DataTable's Insert to return the ItemPointer to the GCManager in the case of a failed insert.
Modified DataTable's InsertIntoIndexes to iterate through indexes and remove inserted keys in the event of a failure.
Modified GCManager's Unlink function to clean indexes from garbage created by COMMIT_DELETE, COMMIT_UPDATE, and ABORT_UPDATE.
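The InsertIntoIndexes change in the list above follows a common roll-back pattern: if inserting into one index fails, the keys already inserted into the earlier indexes are removed. The sketch below shows that pattern only. IndexSketch, its integer keys, and the free function InsertIntoIndexes are hypothetical and do not reflect Peloton's actual index interface.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical index: integer keys, optionally enforcing uniqueness.
struct IndexSketch {
  bool unique;
  std::multiset<int> keys;

  // Fails only when a unique index already contains the key.
  bool Insert(int key) {
    if (unique && keys.count(key) > 0) return false;
    keys.insert(key);
    return true;
  }

  // Removes a single occurrence of the key, if present.
  void Delete(int key) {
    auto it = keys.find(key);
    if (it != keys.end()) keys.erase(it);
  }
};

// Inserts the key into every index. On the first failure, removes the key
// from the indexes that had already accepted it and reports failure.
bool InsertIntoIndexes(std::vector<IndexSketch> &indexes, int key) {
  for (std::size_t i = 0; i < indexes.size(); ++i) {
    if (!indexes[i].Insert(key)) {
      for (std::size_t j = 0; j < i; ++j) {
        indexes[j].Delete(key);
      }
      return false;
    }
  }
  return true;
}
```

Without the inner loop, a failed insert would leave the key behind in the earlier indexes, which is the kind of index pollution the fixes in this list target.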
Coverage increased (+0.4%) to 77.489% when pulling 002881b9926d633b80347fbe602352f10d358486 on mbutrovich:gc_fixes into 4d6182646bb36d13c87e7941fff29c4c0617ba77 on cmu-db:master.