vmware-archive / quickstep

Quickstep Project
Apache License 2.0

Increase the default storage block size to 32MB. #200

Closed pateljm closed 8 years ago

pateljm commented 8 years ago

The downside is some internal fragmentation for small files and potentially reduced concurrency, since each thread's allocations are now larger. However, query performance improves substantially for large data sets.

zuyu commented 8 years ago

@pateljm Please take a look at the test cases I've fixed:

zuyu commented 8 years ago

@pateljm I'll look into the CI build issue caused by a long-running HashTable unit test (more than 10 min).

cramja commented 8 years ago

I would suggest allowing a 2-4MB minimum block size for the reason @pateljm mentioned. Even if the default were 32MB, we should allow the user to specify something smaller, especially because some tables will be tiny.
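As a rough illustration of a configurable block size with a floor, here is a minimal sketch. The names `kMinBlockSizeBytes`, `kDefaultBlockSizeBytes`, and `ResolveBlockSizeBytes` are hypothetical and not taken from the Quickstep code base, and the 4MB floor is just one value from the 2-4MB range suggested above.

```cpp
#include <cstdint>

// Hypothetical limits, chosen only for illustration.
const std::uint64_t kMinBlockSizeBytes = 4ull * 1024 * 1024;       // 4MB floor
const std::uint64_t kDefaultBlockSizeBytes = 32ull * 1024 * 1024;  // 32MB default

// Picks the block size for a new table. A request of 0 means "not specified"
// and yields the default; a request below the floor is raised to the floor;
// anything else is used as given.
inline std::uint64_t ResolveBlockSizeBytes(std::uint64_t requested_bytes) {
  if (requested_bytes == 0) {
    return kDefaultBlockSizeBytes;
  }
  if (requested_bytes < kMinBlockSizeBytes) {
    return kMinBlockSizeBytes;
  }
  return requested_bytes;
}
```

With this sketch, a tiny table could ask for a 4MB or 8MB block and get it, while tables that specify nothing would still get the 32MB default.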

@zuyu I think the storage constants you set in ef4c1cc made sense. That is,

const std::size_t kSlotSizeBytes = 0x400000; // 4MB 
const std::uint64_t kBlockSizeLowerBoundBytes = kSlotSizeBytes;
// The default size of a new relation in terms of the number of slots.
const std::uint64_t kDefaultBlockSizeInSlots = 8; // 32MB

We have to be careful to keep kSlotSizeBytes small so that allocations which don't need much space do not fragment and waste memory.
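To make the waste concrete, here is a small sketch of the arithmetic. It assumes that every allocation is rounded up to a whole number of slots. The helper names `SlotsNeeded` and `WastedBytes` are made up for this example and are not part of the code base.

```cpp
#include <cstdint>

// Number of whole slots needed to hold `bytes`, rounding up.
inline std::uint64_t SlotsNeeded(std::uint64_t bytes,
                                 std::uint64_t slot_size_bytes) {
  return (bytes + slot_size_bytes - 1) / slot_size_bytes;
}

// Bytes left unused when `bytes` is rounded up to whole slots.
inline std::uint64_t WastedBytes(std::uint64_t bytes,
                                 std::uint64_t slot_size_bytes) {
  return SlotsNeeded(bytes, slot_size_bytes) * slot_size_bytes - bytes;
}
```

Under that assumption, an allocation that needs only 1MB leaves 3MB unused with a 4MB slot (`0x400000`), 1MB unused with a 2MB slot (`0x200000`), and 31MB unused with a 32MB slot.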

Though to prevent fragmentation, we probably want a 2MB or smaller slot size:

const std::size_t kSlotSizeBytes = 0x200000; // 2MB

cramja commented 8 years ago

@zuyu Note that your unit tests are likely crashing because the slot size is currently set to 32MB.

zuyu commented 8 years ago

@cramja I know. The test just takes more than 10 min to finish, since it includes resizing a large block (32 MB).

pateljm commented 8 years ago

Ok -- let's drop this PR.