Open abhi-bit opened 9 years ago
After I switched to goleveldb, the index size on disk dropped to 2x - 3x of the bucket's mem_used. Indexing is also very noticeably faster with goleveldb than with the default boltdb option. It might make sense to make goleveldb the default kvstore. (Note: I haven't tested anything besides boltdb.)
OK, I don't see any arrays. How big is the index? Is it possible to share it with me somehow?
The BoltDB-based indexes were 592MB in size and the LevelDB-based ones are 134MB (bucket mem_used is 64MB). Are you asking for the raw index files from disk, or for the bucket data?
Well, with the Bleve index I should be able to reproduce the error and figure out which field in which document it was trying to highlight. I understand it contains some customer sensitive data, so if there is some secure way for me to download it that would be ideal.
Passed details over mail
Figured this might be a good bug to cross-link, as it has some (admittedly old) advice: https://github.com/couchbaselabs/cbft/issues/11
There are 3 types of documents in this bucket (webnutshell). I've redacted some confidential customer info:
cluster_blob: https://gist.github.com/abhi-bit/6bbbcac3ff75d20b0e00
node_blob: https://gist.github.com/abhi-bit/a8892159fd684c510fb6
customer_blob: https://gist.github.com/abhi-bit/62882eae79602bcda77c
Also, I've noticed that the ratio of cbft index file size to bucket mem_used grows as the bucket's dataset grows. In an earlier deployment, I saw a bucket using ~1GB in memory produce indexes of 190GB on disk. I kicked off indexing against that bucket a couple of days ago and will share numbers once indexing is complete.