Open zhujx001 opened 1 day ago
For single-attribute range-filtering ANNS, each vector is associated with one numeric attribute, so the vectors can be pre-sorted in ascending order of attribute value: the vector with the smallest value comes first, the second smallest comes second, and so on, with the largest value last. Our code assumes that the input vectors are already sorted this way, so it does not take an attribute file as input. Users whose data is not already in this order need to pre-sort the vectors by attribute themselves.
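As a minimal sketch of that pre-sorting step (not part of the repository's code), the snippet below reorders vectors by ascending attribute value using only the Python standard library. The function name `sort_by_attribute` and the in-memory list layout are assumptions made for illustration; how the sorted vectors are then written to the input file format the code expects is not shown here.

```python
def sort_by_attribute(vectors, attributes):
    """Return (sorted_vectors, sorted_attributes) in ascending attribute order.

    Hypothetical helper: `vectors` and `attributes` are parallel lists,
    where attributes[i] is the numeric attribute of vectors[i].
    """
    if len(vectors) != len(attributes):
        raise ValueError("vectors and attributes must have the same length")
    # Sort positions by attribute value; sorted() is stable, so vectors
    # with equal attributes keep their original relative order.
    order = sorted(range(len(attributes)), key=attributes.__getitem__)
    return [vectors[i] for i in order], [attributes[i] for i in order]


vectors = [[0.1, 0.2], [0.9, 0.8], [0.5, 0.5]]
attributes = [30, 10, 20]
sorted_vectors, sorted_attributes = sort_by_attribute(vectors, attributes)
```

The sorted attribute list is worth keeping on the user's side: the index itself does not need it, but it is what lets an attribute range be translated into positions later.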
But sorting only tells you each vector's position, not its attribute value. When performing a range query later, how do you determine whether an attribute value falls within the given range? Binary search can find the positions that correspond to the range, but doesn't it also need the attribute values to compare against? I don't quite understand this; please advise.
In the code we provided, for simplicity, the query range generated by the code is essentially the left and right boundaries of indices, denoted as ql and qr. The ANN search is performed on the vectors from index ql to qr. For users, they can generate query ranges based on the attribute of a specific dataset. For example, if a dataset's attribute is date, users can generate a range based on dates, such as from one specific day to another. Then, users need to convert the attribute-based range into an index-based range of vectors, ensuring that the vectors within this index range correspond precisely to the generated date range and exclude any vectors with dates outside the range (this requires pre-sorting the vectors in ascending order based on dates). Our code assumes that the attribute-based query range has already been converted into the corresponding index range of vectors (vectors are pre-sorted accordingly).
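To make the conversion concrete, here is a small sketch of how a user might turn an attribute range into an index range with binary search. It is not taken from the repository: the function name `attribute_range_to_index_range` is hypothetical, the sorted attribute list is assumed to be kept by the user outside the index, and both the attribute range and the resulting `(ql, qr)` pair are assumed to be inclusive on both ends.

```python
from bisect import bisect_left, bisect_right
from datetime import date


def attribute_range_to_index_range(sorted_attributes, low, high):
    """Map the inclusive attribute range [low, high] to inclusive indices.

    Hypothetical helper. `sorted_attributes` must be in ascending order and
    aligned with the pre-sorted vectors. Returns (ql, qr), or None when no
    vector has an attribute inside the range.
    """
    ql = bisect_left(sorted_attributes, low)        # first position with value >= low
    qr = bisect_right(sorted_attributes, high) - 1  # last position with value <= high
    if ql > qr:
        return None
    return ql, qr


# Dates of the vectors, already sorted in ascending order.
sorted_dates = [
    date(2024, 1, 1),
    date(2024, 1, 5),
    date(2024, 1, 5),
    date(2024, 2, 1),
    date(2024, 3, 10),
]

in_range = attribute_range_to_index_range(
    sorted_dates, date(2024, 1, 5), date(2024, 2, 15)
)
empty = attribute_range_to_index_range(
    sorted_dates, date(2024, 4, 1), date(2024, 4, 30)
)
```

The attribute values are needed only for this lookup. Once `ql` and `qr` are known, the search works purely on positions, which is why the search step has no attribute input.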
I see that the index-building instructions only mention that the vectors need to be sorted by attribute, but they don't seem to say how the attributes themselves are handled. Are they placed in a separate attribute file, or handled some other way? There is also no option for an attribute file when searching. Why is that? I don't see any code that reads the attributes. Could you give more detailed instructions?