dataegret / pgcompacttable

Configurable `MAX_PAGES_PER_ROUND` #48

Open damirda opened 1 year ago

damirda commented 1 year ago

I changed MAX_PAGES_PER_ROUND in the code to 50 to speed things up and didn't notice any downsides. Is there a reason it is hardcoded to 5 in the first place rather than being configurable through, say, a command-line parameter?

alexius2 commented 1 year ago

Hello, I agree that it should at least be configurable. A few months ago I ran a test on a really big table with mostly empty pages at the tail and got the following results (each setting tested for 5 minutes):

| MAX_PAGES_PER_ROUND | Speedup      |
|---------------------|--------------|
| 5                   | 1 (baseline) |
| 20                  | 1.49         |
| 50                  | 1.77         |
| 200                 | 1.88         |
| 500                 | 1.39         |

So the optimum for that case was around 200. I'm not sure about changing the default value, though: it depends on the row density of the pages, concurrent updates on the table, and overall server performance. For some bad cases MAX_PAGES_PER_ROUND=5 might be reasonable. It would be great if pgcompacttable adjusted that value dynamically based on query execution time, using something like a moving average, to maximize the number of pages processed per second while limiting the maximum query execution time (to avoid locks).
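
The dynamic adjustment suggested above could look roughly like the following. This is only a minimal sketch in Python, not code from pgcompacttable: the class `PagesPerRoundTuner`, its fields, the 1-second query time limit, the smoothing factor, and the step size are all hypothetical choices made for illustration. After each round it compares the round's throughput (pages/s) with an exponential moving average of earlier rounds, keeps moving the page count in the same direction while throughput holds up, reverses when it drops, and shrinks whenever a round exceeds the time limit.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PagesPerRoundTuner:
    """Hypothetical tuner for pages-per-round; not part of pgcompacttable."""

    pages: int = 5                   # start from the currently hardcoded value
    min_pages: int = 5               # assumed lower bound
    max_pages: int = 500             # assumed upper bound
    max_query_seconds: float = 1.0   # assumed limit to keep lock time short
    alpha: float = 0.3               # smoothing factor of the moving average
    step: float = 1.5                # multiplicative grow/shrink factor
    _avg_rate: Optional[float] = field(default=None, init=False)
    _direction: int = field(default=1, init=False)  # +1 grow, -1 shrink

    def update(self, pages_processed: int, elapsed_seconds: float) -> int:
        """Record one finished round and return the page count for the next."""
        rate = pages_processed / max(elapsed_seconds, 1e-9)

        if elapsed_seconds > self.max_query_seconds:
            # The round ran too long: always shrink.
            self._direction = -1
        elif self._avg_rate is not None and rate < self._avg_rate:
            # Throughput fell below the moving average: reverse direction.
            self._direction = -self._direction

        # Exponential moving average of throughput in pages per second.
        if self._avg_rate is None:
            self._avg_rate = rate
        else:
            self._avg_rate = self.alpha * rate + (1 - self.alpha) * self._avg_rate

        if self._direction > 0:
            candidate = math.ceil(self.pages * self.step)
        else:
            candidate = math.floor(self.pages / self.step)
        self.pages = max(self.min_pages, min(self.max_pages, candidate))

        # Turn around at the bounds so the tuner does not get stuck there.
        if self.pages == self.min_pages:
            self._direction = 1
        elif self.pages == self.max_pages:
            self._direction = -1
        return self.pages
```

A caller would time each round and feed the measurement back, for example `pages = tuner.update(pages_processed, elapsed_seconds)`. Whether a simple hill-climbing rule like this converges well under concurrent updates is untested here; the benchmark above only suggests that the best value lies somewhere between the extremes.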