Closed dribnet closed 2 months ago
Good question!
You can think of mutual_knn(A, B) like the normalized intersection of the nearest neighbor set of A and the nearest neighbor set of C.
Example for k = 100:
If 90 out of 100 of the nearest neighbors overlap between A and B (score of 0.9), there are 10 elements that are in the disjoint set of A and B ( |B-A| = 10 ). More specially, there are 10 elements in the nearest neighbors of B that are not shared with A.
So if B and C also have 90 out of 100 overlap, at worst case 10 of those 90 would be from disjoint of A and B. So the mutual_knn(A, C) would be at least 0.80. So it looks like you can definitely make structures based on the properties of sets.
The "transitive-like" property is $m{AC} \geq m{AB} + m_{AC} - 1$. In other words, you want the overlaps (AB and AC) to each be so large that these overlaps have to overlap. It has a relationship, but is not as strong as what you suggested.
Thanks for that feedback! I better understand the limitations of this approach and will close this issue.
Was playing with the mutual_knn metric trying to build some intuitions. I'm curious whether given three representation spaces A, B and C, if we know the following for a fixed k:
mutual_knn(A, B) = x mutual_knn(B, C) = y
is it possible that we can already know constraints on the mutual_knn(A, C)? Some sort of rule like:
mutual_knn(A, C) >= x * y
As a concrete example - if we know that (A, B) are very highly aligned at 0.9 And (B, C) and very highly aligned at 0.9 Could we conclude that mutual_knn(A, C) is at least 0.81?
My intuition is no 😅. But I thought I would ask out loud anyway since it could be useful to understand what useful algebraic structures these spaces my have.