filodb / FiloDB

Distributed Prometheus time series database
Apache License 2.0

perf(query): add FastColumnFilterMap #1810

Closed alextheimer closed 3 months ago

alextheimer commented 3 months ago

Pull Request checklist

Adds FastColumnFilterMap. Snippet from the javadoc:

 * Bare-bones but (in some cases) ultra-fast ColumnFilterMap.
 * At most two label-values are read during a get() call:
 *   - One to be matched by equality.
 *   - One to be matched by any arbitrary filter.
 * Useful for performance-critical use-cases where this minimal functionality is sufficient:
 *   at most two labels are filtered, and only one supports non-equals filters.
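
To make the constraint in the javadoc concrete, here is a minimal sketch of a map with that shape. It is not the code from this PR: the class name `TwoLabelFilterMap`, its constructor, and the `put`/`get` signatures are hypothetical, and it assumes that a series missing either label simply does not match.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration only; not the FastColumnFilterMap implementation.
final class TwoLabelFilterMap<V> {
    // One arbitrary filter on the second label, plus the value it maps to.
    private static final class Entry<V> {
        final Predicate<String> filter;
        final V value;

        Entry(Predicate<String> filter, V value) {
            this.filter = filter;
            this.value = value;
        }
    }

    private final String equalsLabel;  // label matched by equality
    private final String filterLabel;  // label matched by an arbitrary filter
    private final Map<String, List<Entry<V>>> byEqualsValue = new HashMap<>();

    TwoLabelFilterMap(String equalsLabel, String filterLabel) {
        this.equalsLabel = equalsLabel;
        this.filterLabel = filterLabel;
    }

    void put(String equalsValue, Predicate<String> filter, V value) {
        byEqualsValue
            .computeIfAbsent(equalsValue, k -> new ArrayList<>())
            .add(new Entry<>(filter, value));
    }

    // Reads at most two label values: one hash lookup on the equality label,
    // then a scan of only the filters registered under that value.
    // Returns null when nothing matches (an assumption of this sketch).
    V get(Map<String, String> labels) {
        String eq = labels.get(equalsLabel);
        if (eq == null) {
            return null;
        }
        List<Entry<V>> candidates = byEqualsValue.get(eq);
        if (candidates == null) {
            return null;
        }
        String other = labels.get(filterLabel);
        if (other == null) {
            return null;
        }
        for (Entry<V> e : candidates) {
            if (e.filter.test(other)) {
                return e.value;  // first registered match wins
            }
        }
        return null;
    }
}
```

The equality lookup narrows the candidates with a single hash probe, so the arbitrary filters are only evaluated against the small set of entries that share the equality value. That is one plausible reason a structure like this can outperform a general-purpose filter map.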

Local tests show that FastColumnFilterMap, although significantly more constrained, is about 4x faster than DefaultColumnFilterMap in these limited-filtering scenarios.

alextheimer commented 3 months ago

@yu-shipit I've updated my "10x" claim in the header: with more comprehensive tests, the speedup is closer to 4x.

I tested this by loading two ColumnFilterMap implementations with the same real-world mappings. Then I wrote a getSeries method that returned

For each implementation, I called getSeries and columnFilterMap.get(...) 1M times and recorded the total time each took to complete. A typical result (in milliseconds):

DefaultColumnFilterMap: 2199
FastColumnFilterMap:    611
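
For reference, the typical result above works out to 2199 / 611 ≈ 3.6x. A timing loop of the kind described could look roughly like the sketch below. The `LoopTimer` class and its `timeMillis` helper are hypothetical, and the supplier stands in for the `getSeries` plus `columnFilterMap.get(...)` call, which is not reproduced here.

```java
import java.util.function.Supplier;

// Hypothetical helper; not the harness used for the numbers above.
final class LoopTimer {
    // Results are written to a volatile field so the JIT cannot
    // eliminate the calls being measured as dead code.
    static volatile Object sink;

    // Runs `call` the given number of times and returns elapsed wall-clock milliseconds.
    static long timeMillis(int iterations, Supplier<?> call) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = call.get();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the lookup under test.
        Supplier<String> lookup = () -> String.valueOf(System.nanoTime() % 7);

        timeMillis(100_000, lookup);                    // warm-up pass, result discarded
        long elapsed = timeMillis(1_000_000, lookup);   // measured pass
        System.out.println("elapsed ms: " + elapsed);
    }
}
```

A hand-rolled loop like this is sensitive to JIT warm-up and garbage-collection pauses, so it is best treated as a rough comparison. A JMH benchmark would give more trustworthy numbers.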