Open desmondcheongzx opened 3 years ago
Also, could apply the entire conditional to each series rather than iterating many times over
Right, so there should be a third field called filters
that would store a vector of lambda functions to apply over the data points. When evaluating a variable + metric value predicate, we get the bitmap of relevant series plus the filter to apply.
We can delay applying this conditional as long as the resultSet is involved in AND operations. Once there's an OR operation, we have no choice but to unpack both conditions.
It's hard to evaluate without a proper benchmark, but testing query evaluation on larger data sets is still very very slow even with #13
Currently we perform unions and intersections over individual records. This came as a result of needing to filter records by metrics/timestamps at the leaf level. However, this is potentially costly when we're still filtering results by
labelKey
andlabelValue
pairs.Instead, our
ResultSet
could have two additional fields: abitset
field containing the relevant series in a roaring bitmap, andunpacked
boolean, denoting whether the roaring bitmap has been unpacked into the vector of records.Two
ResultSet
s that haven't been unpacked can be unioned/intersected on their bitsets alone. When aResultSet
has been unpacked, we must unpack any otherResultSet
that it is unioned/intersected with. Finally, before returning our results, we must ensure that theResultSet
has been unpacked.