couchbaselabs / cbft

*THIS PROJECT HAS MOVED* from couchbaselabs TO: https://github.com/couchbase/cbft -- no further development will be done here on couchbaselabs/cbft

Index ingest batch sizes should be configurable #14

Open steveyen opened 9 years ago

steveyen commented 9 years ago

In the code today, there's...

const BLEVE_DEST_INITIAL_BUF_SIZE_BYTES = 20000
const BLEVE_DEST_APPLY_BUF_SIZE_BYTES = 200000

Either we should have some configurability, or (fancier, not sure if better) maybe some self-tuning: start off favoring initial data loading, then automagically work its way down to favoring ingest latency, and then back up again once there's more bulk data loading.
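To make the self-tuning idea concrete, here is a rough sketch and not cbft code. `BatchTuner`, `NewBatchTuner`, and `Observe` are hypothetical names, and the doubling/halving rule is just one simple heuristic assumed for illustration: start at the large size (favoring bulk loading), halve when batches flush mostly empty (favoring latency), and double again when batches fill up.

```go
package main

// BatchTuner is a hypothetical helper (not part of cbft) that adapts the
// batch buffer size between a lower and an upper bound.
type BatchTuner struct {
	minBytes int
	maxBytes int
	curBytes int
}

// NewBatchTuner starts at maxBytes, which favors initial bulk loading.
func NewBatchTuner(minBytes, maxBytes int) *BatchTuner {
	return &BatchTuner{minBytes: minBytes, maxBytes: maxBytes, curBytes: maxBytes}
}

// Observe takes the number of bytes that were pending when the last batch
// was flushed and returns the buffer size to use for the next batch.
func (t *BatchTuner) Observe(flushedBytes int) int {
	switch {
	case flushedBytes >= t.curBytes:
		// The batch filled up completely: looks like bulk loading, so grow.
		t.curBytes *= 2
		if t.curBytes > t.maxBytes {
			t.curBytes = t.maxBytes
		}
	case flushedBytes < t.curBytes/2:
		// The batch was less than half full: favor latency, so shrink.
		t.curBytes /= 2
		if t.curBytes < t.minBytes {
			t.curBytes = t.minBytes
		}
	}
	return t.curBytes
}
```

With bounds of 20000 and 200000 (the two constants above), a run of nearly empty flushes walks the size down to 20000, and a run of full batches walks it back up to 200000.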

Related issue: https://github.com/couchbaselabs/cbft/issues/11 but configurability is kept as a separate issue here, just in case batch sizes turn out not to be the problem.
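For the plain configurability option, a minimal sketch might look like the following. Everything here is an assumption for illustration: the `BatchSizeParams` type, the JSON field names, and the `ParseBatchSizeParams` function do not exist in cbft. Only the two default values come from the constants quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Defaults mirror the constants quoted in this issue.
const (
	defaultInitialBufSizeBytes = 20000
	defaultApplyBufSizeBytes   = 200000
)

// BatchSizeParams is a hypothetical config struct; the JSON field names
// are made up for this sketch.
type BatchSizeParams struct {
	InitialBufSizeBytes int `json:"initialBufSizeBytes"`
	ApplyBufSizeBytes   int `json:"applyBufSizeBytes"`
}

// ParseBatchSizeParams parses optional JSON params. Fields that are absent
// keep their defaults; non-positive sizes are rejected.
func ParseBatchSizeParams(raw string) (BatchSizeParams, error) {
	p := BatchSizeParams{
		InitialBufSizeBytes: defaultInitialBufSizeBytes,
		ApplyBufSizeBytes:   defaultApplyBufSizeBytes,
	}
	if raw == "" {
		return p, nil
	}
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return BatchSizeParams{}, fmt.Errorf("parsing batch size params: %w", err)
	}
	if p.InitialBufSizeBytes <= 0 || p.ApplyBufSizeBytes <= 0 {
		return BatchSizeParams{}, fmt.Errorf("batch sizes must be positive, got %+v", p)
	}
	return p, nil
}
```

Unmarshalling into a struct that is pre-filled with defaults means existing index definitions with no such params would keep today's behavior.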

steveyen commented 9 years ago

Looked at the code recently...

and it seems that, due to DCP semantics, where there's a notion of a snapshot, a snapshot must be incorporated either in its entirety or not at all. So there's no separately controllable batch size anymore.

steveyen commented 9 years ago

Spoke with Marty, and his better idea is to reopen this: de-linking batch ingest sizes from DCP snapshot sizes can be important for performance with some backend KV stores.
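One way the de-linking could look, as a sketch only: split a snapshot's mutations into size-bounded batches for the backend store. `Mutation` and `ChunkSnapshot` are hypothetical names, not cbft or DCP client types, and the size accounting (key bytes plus value bytes) is an assumption. To keep the all-or-nothing snapshot semantics mentioned earlier, the persisted progress marker would presumably be advanced only after the last batch of a snapshot is applied; that part is not shown here.

```go
package main

// Mutation is a hypothetical stand-in for a single DCP mutation.
type Mutation struct {
	Key   string
	Value []byte
}

// ChunkSnapshot splits one snapshot's mutations, in order, into consecutive
// batches of at most maxBatchBytes (key bytes plus value bytes). A single
// mutation larger than maxBatchBytes gets a batch of its own.
func ChunkSnapshot(snapshot []Mutation, maxBatchBytes int) [][]Mutation {
	var batches [][]Mutation
	var cur []Mutation
	curBytes := 0
	for _, m := range snapshot {
		size := len(m.Key) + len(m.Value)
		// Close the current batch if adding this mutation would overflow it.
		if len(cur) > 0 && curBytes+size > maxBatchBytes {
			batches = append(batches, cur)
			cur = nil
			curBytes = 0
		}
		cur = append(cur, m)
		curBytes += size
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}
```

The opposite direction, merging several small snapshots into one backend batch, would need the same kind of bookkeeping about which snapshot boundaries a batch covers.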