tokee / lucene-solr

High cardinality faceting (SOLR-5894)
http://tokee.github.io/lucene-solr/

Add cached external terms for low-cardinality fields #21

Closed tokee closed 9 years ago

tokee commented 10 years ago

Depending on ordinal mapping, index setup and storage system, resolving ordinals to terms can take a non-trivial amount of time when facet.limit is high. A global ordinal -> external term cache would make that lookup near-instantaneous. The trade-off is memory, so this is probably a bad idea for medium- to high-cardinality fields.
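To illustrate the idea, here is a minimal sketch of a lazily filled ordinal -> term cache. It is not the code from this branch: the class name `OrdinalTermCache` and the `slowLookup` function are hypothetical stand-ins for whatever resolves a global ordinal to its external term, and the sketch is not thread-safe.

```java
import java.util.function.IntFunction;

/** Hypothetical sketch: caches ordinal -> external term lookups in an array. */
final class OrdinalTermCache {
    // One slot per global ordinal; null means "not resolved yet".
    private final String[] terms;
    // Assumed stand-in for the real (potentially slow) ordinal resolver.
    private final IntFunction<String> slowLookup;
    // Counts how often the slow resolver had to be called.
    private int misses = 0;

    OrdinalTermCache(int uniqueTerms, IntFunction<String> slowLookup) {
        this.terms = new String[uniqueTerms];
        this.slowLookup = slowLookup;
    }

    /** Returns the term for the ordinal, resolving it at most once. */
    String term(int ordinal) {
        String cached = terms[ordinal];
        if (cached == null) {
            cached = slowLookup.apply(ordinal);
            terms[ordinal] = cached;
            misses++;
        }
        return cached;
    }

    int misses() {
        return misses;
    }
}
```

The memory cost grows with the number of unique terms plus the size of the terms themselves, which is why this only looks attractive for low-cardinality fields.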

Set facet.sparse.termlookup.maxcache to a value higher than the number of unique terms in the field to enable the cache. Only the memory needed to hold the actual terms is used, so specifying 2147483648 to force caching is fine.
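The enabling rule can be sketched as below. This is only an illustration of the rule as described in this comment, not the actual implementation: the class `TermLookupCachePolicy` and its methods are hypothetical, and treating the parameter as a `long` is an assumption made so that a value such as 2147483648 fits.

```java
/** Hypothetical sketch of the maxcache rule described in this issue. */
final class TermLookupCachePolicy {
    private TermLookupCachePolicy() {}

    /** The cache is enabled only when maxCache exceeds the unique term count. */
    static boolean cacheEnabled(long maxCache, long uniqueTerms) {
        return maxCache > uniqueTerms;
    }

    /**
     * Number of cache slots needed: one per actual term when enabled,
     * none otherwise. A very large maxCache does not inflate this.
     */
    static long slotsNeeded(long maxCache, long uniqueTerms) {
        return cacheEnabled(maxCache, uniqueTerms) ? uniqueTerms : 0L;
    }
}
```

With these assumptions, a field with 1000 unique terms and a maxcache of 2147483648 needs 1000 slots, while a maxcache of 1000 leaves the cache disabled.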

tokee commented 10 years ago

This has been implemented for single-shard faceting with DocValues.

Extending this to multi-shard and non-DocValues faceting uses nearly the same code, so no problems are expected.

tokee commented 9 years ago

This will not be extended to non-DocValues, as Solr 5.x is DocValues-API only for faceting.