Closed klauck closed 4 months ago
thanks a lot!
could you provide some figures about the changes, like what are the previous and new size for a few hypothetical index scenarios, and how is evolving the bloat factor for 90% fillfactor and a few other reasonable values?
For the default fillfactor=90
, the bloat factor is similar for the new and old estimation.
For lower fill factors (e.g., fillfactor=50
), the bloat factors are significantly higher with the new estimation.
Regarding the overall estimated size, another change proposal is relevant:
line_size = ind_avg_width +
+(sizeof(IndexTupleData) * entry->ncolumns)
+ MAXALIGN(sizeof(ItemIdData) * entry->ncolumns);
-->
line_size = ind_avg_width + MAXALIGN(sizeof(IndexTupleData)) + sizeof(ItemIdData);
The new estimated line size is usually smaller (which is compensated by the larger bloat factor). Note, the larger the estimated line size, the larger becomes the compensation, because the line size and bloat factor are multiplied. I.e., for small line sizes, the old estimation is larger, but for large line sizes, the new estimation is larger. Nevertheless, the old and new estimated index sizes are usually similar. But it would be possible to construct cases in which they diverge.
Thanks a lot, the various info and graphs are really useful!
It looks good to me, I'm merging the PR. Thanks again for your work!
Thank you, too. I like your project and am happy to contribute!
This pull request primarily adds a better documentation (i.e., adds comments) for the index size estimation, based on the discussion for pull request #64 with @rjuju and @lfittl.
It also (slightly) adjusts the size estimation:
additional_bloat
20 -> 30 (to account for the other adjustments)IndexTupleData
andItemIdData
are added only once per index entryIndexTupleData
and not forItemIdData
bloat_factor
is calculated as documented based onfillfactor
andadditional_bloat
Fix some typos in hypopg_index.c