IntelLabs / vdms

VDMS: Your Favorite Visual Data Management System
MIT License

Descriptors knn and constraints do not work together #80

Closed luisremis closed 2 years ago

luisremis commented 5 years ago

When a FindDescriptor query combines a k-nearest-neighbor (knn) search with constraints, the final result is not filtered correctly. The behavior for this case must be defined and documented in the wiki.

Example case:

def test_findDescByBlobAndConstraints(self):

    # Add Set
    set_name = "findwith_blob_const"
    dims = 128
    total = 100
    self.create_set_and_insert(set_name, dims, total)

    db = vdms.vdms()
    db.connect(hostname, port)

    kn = 3  # number of nearest neighbors to request

    finddescriptor = {}
    finddescriptor["set"] = set_name
    finddescriptor["k_neighbors"] = kn

    results = {}
    results["list"] = ["myid", "_id", "_distance"]
    results["blob"] = True
    finddescriptor["results"] = results

    constraints = {}
    constraints["myid"] = ["==", 205]
    finddescriptor["constraints"] = constraints

    query = {}
    query["FindDescriptor"] = finddescriptor

    all_queries = []
    all_queries.append(query)

    descriptor_blob = []
    x = np.ones(dims)
    x[2] = 2.34 + 30*20
    x = x.astype('float32')
    descriptor_blob.append(x.tobytes())

    response, blob_array = db.query(all_queries, [descriptor_blob])
    print(db.get_last_response_str())

    self.assertEqual(len(blob_array), kn)
    self.assertEqual(descriptor_blob[0], blob_array[0])

    # Check success
    self.assertEqual(response[0]["FindDescriptor"]["status"], 0)
    self.assertEqual(response[0]["FindDescriptor"]["returned"], kn)

    self.assertEqual(response[0]["FindDescriptor"]
                                ["entities"][0]["_distance"], 0)
    self.assertEqual(response[0]["FindDescriptor"]
                                ["entities"][1]["_distance"], 400)
    self.assertEqual(response[0]["FindDescriptor"]
                                ["entities"][2]["_distance"], 400)

luisremis commented 5 years ago

This requires applying the constraints as a filter after the knn search is done. This is not a bug; it is an enhancement. I will update the wiki to reflect that this is the expected behavior.
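For readers following along, here is a minimal sketch of what "filter after knn" means, written in plain NumPy rather than against VDMS. The function name `knn_then_filter`, the sample vectors, and the `myid` values are all made up for illustration and say nothing about how VDMS implements the search.

```python
import numpy as np

def knn_then_filter(vectors, props, query, k, predicate):
    """Brute-force knn followed by a property filter (post-filtering)."""
    # Squared L2 distance from the query to every stored vector.
    dists = np.sum((vectors - query) ** 2, axis=1)
    # Indices of the k closest vectors, nearest first.
    nearest = np.argsort(dists, kind="stable")[:k]
    # Keep only the neighbors whose properties satisfy the constraint.
    return [(int(i), float(dists[i])) for i in nearest if predicate(props[i])]

# Hypothetical data: five 2-D descriptors, each with a "myid" property.
vectors = np.array([[0, 0], [1, 0], [0, 2], [3, 0], [0, 4]], dtype=np.float32)
props = [{"myid": 200 + i} for i in range(5)]
query = np.zeros(2, dtype=np.float32)

hits = knn_then_filter(vectors, props, query, k=3,
                       predicate=lambda p: p["myid"] == 202)
print(hits)
```

Note that with this order of operations the caller can get back fewer than `k` results (here one of three), because the constraint is applied only to the neighbors that the knn step already selected.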

prashastk commented 5 years ago

Is there a way to apply the constraints before the knn is done? We are planning to have entities with properties on which we want to apply a constraint, and then do a knn on the descriptors connected to those entities.

luisremis commented 5 years ago

This enhancement is different from what you need, it seems. It is about doing a knn and then filtering the closest neighbors by some constraints on their properties, all within a single FindDescriptor.

What you are asking for, it seems, is constraining the result of the knn to only those descriptors linked with elements returned by some other find (findEntity, findImage) in the transaction. We do not support that at this time. Let me go over the implementation in the next couple of days and think about how we can include that functionality.

lesterlitch commented 5 years ago

To me this is fairly critical. There are already many ways of doing fast nearest-neighbor search on vectors, and filtering the results can be done in a few lines of Python.
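As a rough illustration of the "few lines of Python" point, the sketch below filters the entities of a FindDescriptor response on the client side. The response layout follows the test case earlier in this thread; the helper name `filter_entities` and every value in the sample response are invented for the example.

```python
def filter_entities(response, key, value):
    """Return the FindDescriptor entities whose property `key` equals `value`."""
    entities = response[0]["FindDescriptor"]["entities"]
    return [e for e in entities if e.get(key) == value]

# Hypothetical response with made-up ids and distances.
response = [{"FindDescriptor": {
    "status": 0,
    "returned": 3,
    "entities": [
        {"myid": 230, "_distance": 0.0},
        {"myid": 205, "_distance": 400.0},
        {"myid": 231, "_distance": 400.0},
    ],
}}]

matches = filter_entities(response, "myid", 205)
print(matches)
```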

A good example of a successful approach here is Elasticsearch, i.e. a Boolean search followed by a vector-space search. The same could be achieved here by keeping a feature index and then having a way to quickly cut down the candidate set using, say, binary flags.
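A minimal sketch of that idea, assuming each descriptor carries a small integer of bit flags: the Boolean step keeps only descriptors that have all required flags set, and the distance computation then runs on the reduced candidate set. The flag names, the `masked_knn` helper, and the sample data are hypothetical and are not part of VDMS or Elasticsearch.

```python
import numpy as np

# Hypothetical flags, one bit per Boolean property.
FLAG_INDOOR = 0b01
FLAG_HAS_FACE = 0b10

def masked_knn(vectors, flags, query, k, required):
    """Cut the candidate set with a bit mask, then run brute-force knn on it."""
    # Boolean step: keep descriptors that have every required flag set.
    idx = np.flatnonzero((flags & required) == required)
    # Vector step: squared L2 distance, computed only for the candidates.
    dists = np.sum((vectors[idx] - query) ** 2, axis=1)
    order = np.argsort(dists, kind="stable")[:k]
    # Map positions in the candidate list back to original descriptor indices.
    return idx[order].tolist()

vectors = np.array([[0, 0], [1, 0], [0, 2], [3, 0], [0, 4]], dtype=np.float32)
flags = np.array([0b01, 0b11, 0b10, 0b11, 0b00], dtype=np.uint8)
query = np.zeros(2, dtype=np.float32)

neighbors = masked_knn(vectors, flags, query, k=2,
                       required=FLAG_INDOOR | FLAG_HAS_FACE)
print(neighbors)
```

Unlike filtering after the knn, this order returns up to `k` results whenever at least `k` descriptors satisfy the flags, at the cost of computing distances over the whole filtered set.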