luceneplusplus / LucenePlusPlus

Lucene++ is an up to date C++ port of the popular Java Lucene library, a high-performance, full-featured text search engine.

lucene posting list implementation #189

Open patelprateek opened 1 year ago

patelprateek commented 1 year ago

I read that after a query runs, Lucene uses a filter cache in which the posting list is encoded as compressed bitmaps (Roaring). Is there any API to retrieve these compressed bitmaps, rather than iterating over the actual document IDs?

My use case is that some filters can produce a very large number of hits (more than 10 million), and in such scenarios the compressed bitmaps could help with downstream logic. Are there any recommendations or pointers to other approaches? Also, for a given query, is it possible to do a quick dry run to get an estimate of the number of documents it will return?
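To make the downstream logic concrete, here is a minimal sketch of the fallback I would otherwise have to write by hand: iterate over the matching document IDs once and pack them into a bitmap. Everything in it is an assumption for illustration. `DocIdBitmap` is a hypothetical helper and not part of the Lucene++ API, and it is a plain uncompressed bitmap rather than a Roaring one, so it only shows the kind of operations (membership test, hit count, intersection) I would like to run on the bitmap directly.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, NOT part of Lucene++: a plain (uncompressed) bitmap
// over document IDs in the range [0, maxDoc).
class DocIdBitmap {
public:
    explicit DocIdBitmap(std::size_t maxDoc)
        : words_((maxDoc + 63) / 64, 0), maxDoc_(maxDoc) {}

    // Mark a document as a hit; out-of-range IDs are ignored.
    void set(std::size_t docId) {
        if (docId < maxDoc_) {
            words_[docId / 64] |= (std::uint64_t{1} << (docId % 64));
        }
    }

    // True if the document was marked as a hit.
    bool test(std::size_t docId) const {
        return docId < maxDoc_ &&
               ((words_[docId / 64] >> (docId % 64)) & 1u) != 0;
    }

    // Number of hits, counted by repeatedly clearing the lowest set bit.
    std::size_t count() const {
        std::size_t total = 0;
        for (std::uint64_t w : words_) {
            for (; w != 0; w &= w - 1) {
                ++total;
            }
        }
        return total;
    }

    // In-place intersection with another filter's bitmap.
    void intersect(const DocIdBitmap& other) {
        const std::size_t n = words_.size() < other.words_.size()
                                  ? words_.size()
                                  : other.words_.size();
        for (std::size_t i = 0; i < n; ++i) {
            words_[i] &= other.words_[i];
        }
        // Words beyond the other bitmap's range cannot contain common hits.
        for (std::size_t i = n; i < words_.size(); ++i) {
            words_[i] = 0;
        }
    }

private:
    std::vector<std::uint64_t> words_;
    std::size_t maxDoc_;
};
```

Uncompressed, this costs one bit per document in the index regardless of how many documents match, so an index with 10 million documents needs about 1.25 MB per filter. That overhead, plus the cost of the initial pass over the document IDs, is why I am asking whether the already-compressed form can be retrieved instead.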