Open tjungblu opened 4 days ago
High level makes sense. Eventually we should consider to deprecate the Array
free list.
Admittedly, this interface was too idealistic for such an old codebase :) Turns out we're leaking implementation details everywhere and for some optimization also have introduced bad coupling. The tests are especially fun to decouple :1st_place_medal:
Also the coupling of the transaction to the pages should not go into the freelist (imho) and we should have another mapping in the Tx that contains what pages have been allocated for it.
Below seems like a functional interface, but I think we would need to cut this down a bit further to be practical.
type Freelist interface {
// Init initializes this freelist with the given list of pages.
Init(ids common.Pgids)
// Allocate returns the starting page id of a contiguous block of a given size in number of pages.
// If a contiguous block cannot be found then 0 is returned.
Allocate(txid common.Txid, numPages int) common.Pgid
// Count returns the number of free pages.
Count() int
// PendingCount returns the number of pending pages.
// TODO(thomas): implementation detail?
PendingCount() int
// FreePageIds returns all free pages.
FreePageIds() common.Pgids
// Release moves all page ids for a transaction id (or older) to the freelist.
Release(txid common.Txid)
// ReleaseRange moves pending pages allocated within an extent [begin,end] to the free list.
ReleaseRange(begin, end common.Txid)
// Free releases a page and its overflow for a given transaction id.
// If the page is already free then a panic will occur.
Free(txid common.Txid, p *common.Page)
// Rollback removes the pages from a given pending tx.
Rollback(txid common.Txid)
// Freed returns whether a given page is in the free list.
Freed(pgId common.Pgid) bool
// Reload reads the freelist from a page and filters out pending items.
// TODO(thomas): could be just Read?
Reload(p *common.Page)
// NoSyncReload reads the freelist from Pgids and filters out pending items.
// TODO(thomas): could be just Init?
NoSyncReload(Pgids []common.Pgid)
/*
* SerDe Section
*/
// Read calls Init with the page ids stored in te given page.
Read(page *common.Page)
// EstimatedWritePageSize returns the size of the page after serialization in Write.
// TODO(thomas): implementation detail?
EstimatedWritePageSize() int
// Write writes the freelist into the given page.
Write(page *common.Page)
}
source (not everything is implemented/updated yet): https://github.com/tjungblu/bbolt/commit/d3831cbb52476bf939a4cd1d95f11280ca03b08f
Do you want to continue through a PR? Happy to adjust and refactor the remaining pieces.
opened #775, let's focus on the interface first. I think we also need @ivanvc's benchmark for it :)
I suggest to move all array related methods into a separate file freelist_array.go
as the very first step, so as to simplify the freelist.go
. Please feel free to deliver a PR for that, thx.
After that we can continue to discuss how to refactor the freelist management in your PR https://github.com/etcd-io/bbolt/pull/775.
Also as your pointed out above, we need the benchmark (https://github.com/etcd-io/bbolt/pull/750) workflow check ready beforehand. @ivanvc can you take that as a higher priority task? thx
there you go @ahrtr https://github.com/etcd-io/bbolt/pull/777
@ivanvc can you take that as a higher priority task? thx
Yes, I was able to work on it yesterday. I'll try to have something by the EOD.
bbolt features two types of freelist: "array" and "hashmap". They are currently implemented in one monolithic struct, with their differences in functions separated into two go files.
Testing has been a pain so far, with many of the freelist details (i.e.releasing pages and serde to disk) leaking outside of the freelist struct. This makes reasoning about the implementations very difficult, too. Adding a new type of list is becoming increasingly cumbersome without a proper interface.
Quickly pulling out the existing functions into an interface could look like this:
Ideally we would move this into its own internal package in order to avoid leaking any more internals in the future. HashMap and Array should be two implementation as its own struct. Some shared functions (ie read/write) could be implemented in a common struct for now. Even though I expect those to become implementation dependent at some point, it seems quite inefficient to write low-digit uint64 integers without vint compression (or as a bitset), for example.