ofekhta1 / Face-Server


Cluster By Family #4

Open DanielGrin31 opened 1 month ago

DanielGrin31 commented 1 month ago

Use batch input for the family classifier and optimize the similarity calculation.
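A rough sketch of what batching could look like, assuming the embeddings are rows of a NumPy array. `score_batch` is a hypothetical stand-in for the family classifier (its real input format is not described in this issue), and `batch_size` is an arbitrary choice:

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Return the (n, n) cosine similarity matrix for n row-vector embeddings."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Clip the norms so an all-zero embedding does not cause a division by zero.
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T

def batched_pair_scores(embeddings, score_batch, batch_size=1024):
    """Score every unordered pair (i < j) in fixed-size batches.

    score_batch is hypothetical: it takes two (b, d) arrays and returns b scores.
    """
    rows, cols = np.triu_indices(len(embeddings), k=1)
    scores = np.empty(len(rows), dtype=float)
    for start in range(0, len(rows), batch_size):
        stop = start + batch_size
        scores[start:stop] = score_batch(
            embeddings[rows[start:stop]], embeddings[cols[start:stop]]
        )
    return rows, cols, scores
```

The similarity matrix is computed with one matrix product instead of a Python loop over pairs, and the classifier is called once per batch of pairs rather than once per pair.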

DanielGrin31 commented 1 month ago

~1. Need to optimize by using vector search instead of a cosine similarity matrix over all embedding vectors.~ Members of the same family can have dissimilar embeddings, so a similarity search may miss them and this may not work.

groups = []
for face_embedding_to_check in all_embeddings:
    face_embedding_to_check = [...]
    similar_embeddings = self.emb_manager.search(...)
    # calculate is_family for all similar vectors

    # add to groups
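One possible way to flesh out the loop above, as a sketch only. `search` and `is_family` are hypothetical callables standing in for `self.emb_manager.search` and the family classifier, whose real signatures are not shown here. Merging groups with a union-find structure is my assumption about what "add to groups" means:

```python
def group_by_family(all_embeddings, search, is_family):
    """Group embedding indices into families.

    search(embedding) -> iterable of candidate indices (hypothetical).
    is_family(emb_a, emb_b) -> bool (hypothetical).
    """
    parent = list(range(len(all_embeddings)))

    def find(i):
        # Walk up to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, emb in enumerate(all_embeddings):
        for j in search(emb):
            if j != i and is_family(emb, all_embeddings[j]):
                # Merge the two groups that i and j currently belong to.
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(all_embeddings)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With this merge strategy a single positive `is_family` result between two groups joins them into one, so the output is only as good as the classifier's precision.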
1. Retrain the family classifier.
2. False positives tend to merge family clusters.
3. Use probabilities for DBSCAN.
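A sketch of one reading of the last point, assuming the classifier can output a pairwise family probability matrix and that scikit-learn's `DBSCAN` is the implementation in use (neither is confirmed in this issue). The probability is turned into a distance as `1 - p`; the `eps` and `min_samples` values are placeholders, not tuned settings:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_from_probabilities(family_prob, eps=0.5, min_samples=2):
    """Cluster faces from an (n, n) matrix of pairwise family probabilities."""
    prob = np.asarray(family_prob, dtype=float)
    # Symmetrize in case the classifier is not order-invariant.
    prob = (prob + prob.T) / 2.0
    # High probability means small distance; keep distances in [0, 1].
    distance = 1.0 - np.clip(prob, 0.0, 1.0)
    np.fill_diagonal(distance, 0.0)
    model = DBSCAN(eps=eps, min_samples=min_samples, metric="precomputed")
    return model.fit_predict(distance)
```

Compared with hard yes/no links, a low-confidence false positive then contributes a large distance, and whether it bridges two clusters depends on `eps` rather than on a fixed decision threshold.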