Open equals215 opened 3 months ago
@CorentinB throw out everything that comes to mind related to that matter. Any ideas, features, must-have/must-do, warnings. Everything.
Do we want to have an option to disable this? (in order to save some disk I/O when we know the crawl will be short and we don't care about saving some disk space while it runs)
It's fully in memory, gets dumped at the same time as the rest of the index, and is tracked through the queue index WAL: you can take the queue index add/pop operations and derive the freeSpace index operations from them.
So yeah, we can make that optional, but it's in-memory, so there are no disk I/O related performance issues.
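To illustrate the derivation mentioned above, here is a minimal sketch of rebuilding a free-space view purely from queue index add/pop records. `OpKind`, `WALEntry`, `FreeSpace`, and `Replay` are hypothetical names made up for this example, not the project's actual types, and the record layout (offset plus size) is an assumption.

```go
package main

// OpKind is a hypothetical tag for queue index WAL records.
type OpKind int

const (
	OpAdd OpKind = iota // an item was written to the queue file
	OpPop               // an item was removed from the queue
)

// WALEntry is an assumed shape for one queue index WAL record.
type WALEntry struct {
	Kind   OpKind
	Offset uint64 // position of the item in the queue file
	Size   uint64 // length of the item in bytes
}

// FreeSpace maps the offset of a free region to its size in bytes.
type FreeSpace map[uint64]uint64

// Replay rebuilds the free-space view from queue index operations alone:
// a pop frees the item's region, and an add that starts at a freed offset
// consumes it. Adds landing in the middle of a free region are not handled
// in this sketch.
func Replay(entries []WALEntry) FreeSpace {
	free := make(FreeSpace)
	for _, e := range entries {
		switch e.Kind {
		case OpPop:
			free[e.Offset] = e.Size
		case OpAdd:
			// Adds appended at the end of the file touch no free region.
			if size, ok := free[e.Offset]; ok {
				delete(free, e.Offset)
				// Keep any unused tail of the region available.
				if size > e.Size {
					free[e.Offset+e.Size] = size - e.Size
				}
			}
		}
	}
	return free
}
```

Because the view is a pure function of the add/pop records, no separate freeSpace entries need to be written to the WAL for it to be recoverable.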
The goal of this PR is to drastically slow down the growth of the queue by reusing disk space from popped items. This `freeSpace` index will use a lock-free size-specific slot array (LSSA) for common item sizes (to be determined by analyzing existing indexes) and a stratified list for uncommon free space sizes. Also thinking of a defragmentation algorithm ~~and a way to store freeSpace index operations into the WAL~~ <- the `freeSpace` index is derived from index adds and pops...
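A rough sketch of the two-tier lookup described above, under several assumptions: it is single-goroutine and not lock-free, the per-size slots are plain slices used as stacks, and a slice sorted by size with best-fit search stands in for the stratified list. `FreeIndex`, `Release`, and `Acquire` are hypothetical names, and the common sizes are passed in by hand rather than derived from index analysis.

```go
package main

import "sort"

// region is a free span of the queue file.
type region struct {
	offset, size uint64
}

// FreeIndex keeps exact-size slots for common item sizes and a
// size-sorted fallback list for everything else.
type FreeIndex struct {
	slots map[uint64][]uint64 // common size -> stack of free offsets
	other []region            // uncommon sizes, sorted by size
}

// NewFreeIndex registers the sizes that get a dedicated slot stack.
func NewFreeIndex(commonSizes ...uint64) *FreeIndex {
	idx := &FreeIndex{slots: make(map[uint64][]uint64)}
	for _, s := range commonSizes {
		idx.slots[s] = nil
	}
	return idx
}

// Release records a region freed by a pop.
func (f *FreeIndex) Release(offset, size uint64) {
	if _, common := f.slots[size]; common {
		f.slots[size] = append(f.slots[size], offset)
		return
	}
	// Insert into the fallback list, keeping it sorted by size.
	i := sort.Search(len(f.other), func(i int) bool { return f.other[i].size >= size })
	f.other = append(f.other, region{})
	copy(f.other[i+1:], f.other[i:])
	f.other[i] = region{offset, size}
}

// Acquire returns an offset with room for size bytes, or false when the
// caller should append to the end of the file instead. A best-fit hit in
// the fallback list wastes the leftover bytes; reclaiming them would be
// the job of a defragmentation pass, which is out of scope here.
func (f *FreeIndex) Acquire(size uint64) (uint64, bool) {
	if offs := f.slots[size]; len(offs) > 0 {
		off := offs[len(offs)-1]
		f.slots[size] = offs[:len(offs)-1]
		return off, true
	}
	i := sort.Search(len(f.other), func(i int) bool { return f.other[i].size >= size })
	if i == len(f.other) {
		return 0, false
	}
	r := f.other[i]
	f.other = append(f.other[:i], f.other[i+1:]...)
	return r.offset, true
}
```

The exact-size slots make the common case a constant-time push or pop, while uncommon sizes pay for a binary search. A real lock-free version would need atomic operations on the slots, which this sketch does not attempt.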