vitrivr's next-generation retrieval engine. It is capable of extracting and retrieving a wide range of multimedia objects such as audio, video, images or 3D models.
Description
In the XReco context (and potentially in some other applications), we require late-stage filtering for queries. This type of filter removes Retrievables from a result stream generated by some upstream Retriever.
Solution
I'd propose to have a LateFilter that can do late-stage filtering of Retrievables based on some boolean predicate. The way this filter works is roughly as follows:
It fetches the necessary field from the schema.
It performs a comparison of the field's value with a predicate.
It removes all Retrievables that don't match the predicate.
As a side-effect, the LateFilterOperator can append the fetched field to the Retrievable (similar to the FieldLookup).
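The steps above can be sketched roughly as follows. This is only an illustration in Python, not the engine's actual API: the `Retrievable` class, the `late_filter` function, the `fetch_field` callback and the `append` flag are all hypothetical names, and dropping Retrievables for which no field value exists is an assumption the proposal does not settle.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, Iterator, Optional


@dataclass
class Retrievable:
    """Hypothetical stand-in for a Retrievable: an id plus attached attributes."""
    id: str
    attributes: Dict[str, Any] = field(default_factory=dict)


def late_filter(
    upstream: Iterable[Retrievable],
    field_name: str,
    fetch_field: Callable[[str], Optional[Any]],
    predicate: Callable[[Any], bool],
    append: bool = True,
) -> Iterator[Retrievable]:
    """Yield only the Retrievables whose field value satisfies the predicate."""
    for retrievable in upstream:
        # Step 1: fetch the field value for this Retrievable.
        value = fetch_field(retrievable.id)
        # Steps 2 and 3: compare against the predicate and drop non-matches.
        # Assumption: a Retrievable without a value for the field is dropped.
        if value is None or not predicate(value):
            continue
        # Side effect: attach the fetched value, similar to a field lookup.
        if append:
            retrievable.attributes[field_name] = value
        yield retrievable


# Usage: keep only results whose (made-up) "year" field is at least 2021.
years = {"a": 2019, "b": 2023, "c": 2021}
results = [Retrievable("a"), Retrievable("b"), Retrievable("c"), Retrievable("d")]
kept = list(late_filter(results, "year", years.get, lambda year: year >= 2021))
```

In this sketch the filter is a generator, so it processes the upstream result one Retrievable at a time and never needs the full result set in memory.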
For the sake of simplicity, a first implementation compares a single field to a single predicate. Logical combinations (AND, OR) are not supported. There are some implementation details to consider and optimise here:
Batch processing vs. stream processing for large result-sets
Attribute matching for structs
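To make these two points more concrete, here is a minimal Python sketch under stated assumptions. It fetches field values one batch at a time (one lookup per batch instead of one per Retrievable) and matches a single attribute inside a struct via a dotted path. The names `resolve_path`, `late_filter_batched` and `fetch_many`, the dotted-path syntax and the default batch size are all hypothetical and not part of the engine.

```python
from collections.abc import Mapping
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List


def resolve_path(struct: Any, path: str) -> Any:
    """Walk a dotted path such as 'camera.model'; return None if any part is missing."""
    current = struct
    for key in path.split("."):
        if not isinstance(current, Mapping) or key not in current:
            return None
        current = current[key]
    return current


def late_filter_batched(
    ids: Iterable[str],
    fetch_many: Callable[[List[str]], Dict[str, Any]],
    path: str,
    predicate: Callable[[Any], bool],
    batch_size: int = 100,
) -> Iterator[str]:
    """Filter ids in batches, issuing one field lookup per batch."""
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        values = fetch_many(batch)  # single lookup for the whole batch
        for rid in batch:
            value = resolve_path(values.get(rid), path)
            # Assumption: ids without a value at the given path are dropped.
            if value is not None and predicate(value):
                yield rid


# Usage with a made-up struct field; `calls` records how the lookups were batched.
store = {
    "a": {"camera": {"model": "X100"}},
    "b": {"camera": {"model": "Z6"}},
    "c": {"title": "no camera"},
}
calls: List[List[str]] = []


def fetch_many(batch: List[str]) -> Dict[str, Any]:
    calls.append(list(batch))
    return {rid: store[rid] for rid in batch if rid in store}


kept = list(
    late_filter_batched(["a", "b", "c", "d"], fetch_many, "camera.model",
                        lambda model: model == "Z6", batch_size=2)
)
```

The trade-off this sketch illustrates is that larger batches mean fewer lookups but more buffered results and a longer wait before the first match is emitted, while a batch size of 1 degenerates to pure stream processing.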
Dependencies
No known dependencies.
Context
It is currently not in scope to push these filters down and combine them with the retrieval operation (which could be done, e.g., in PostgreSQL). This would constitute a next step.