Reid in case of occlusion/object going out and coming back in the frame

Hi @aguscas ,

I am using reid to rematch faces in case of occlusion. I have implemented something by following reid demo.py example. The only changes I made are:

1) Used embeddings from the VGG-face model:

detection.embedding = DeepFace.represent(img_path = cut, model_name = embed_model, enforce_detection = False, detector_backend = "retinaface")[0]["embedding"]

2) Used cosine similarity to find distance between faces

This is my embedding_distance method:

def embedding_distance(matched_not_init_trackers, unmatched_trackers):
    snd_embedding = unmatched_trackers.last_detection.embedding

    if snd_embedding is None:
        for detection in reversed(unmatched_trackers.past_detections):
            if detection.embedding is not None:
                snd_embedding = detection.embedding
                break
        else:
            return 1

    for detection_fst in matched_not_init_trackers.past_detections:
        if detection_fst.embedding is None:
            continue

        distance = 1 - cosine(snd_embedding, detection_fst.embedding)

        if distance < 0.5:
            return distance
    return 1

For some reason, it's not working and the tracker always assigns a new id/color to a face after occlusion. I have tried some of the suggestions from this page: https://tryolabs.github.io/norfair/2.2/getting_started/#detection-issues but no luck so far.

Do you have any suggestions for me?

I can share the code and my test video, in case that will provide more clarity. Thanks!

Hello again @utility-aagrawal !

I have a few ideas that can help you, but of course that depends on what is the root cause of the problem. Most of these involve tweaking some of the parameters of your Tracker instance.

Increase the hit_counter_max: In a previous response I mentioned that one of the steps in Norfair is matching TrackedObject instances with Detection instances. If a TrackedObject doesn't match a Detection in a particular frame, then we try to match it with new TrackedObject instances (the not initialized ones). But if you get no matches at all (with either detections or not initialized tracked_objects), you might still want to try the same thing again in the next frame (compare that TrackedObject with new detections and not initialized tracked_objects). hit_counter_max determines for how many consecutive frames without a match you keep trying to match that object with a Detection, before saying 'fuck it, I will only try to compare it with not yet initialized TrackedObject instances'. This can help if your occlusions don't last too long and the object could have matched a detection again before getting destroyed. I don't think this is the first thing you should try, but I mention it first because it makes the explanation of the other parameters simpler.
Increase the reid_hit_counter_max: As I mentioned before, if a TrackedObject is not matching with either a Detection or a not initialized TrackedObject for many consecutive frames, you will start trying to only match it with not initialized TrackedObject instances. This parameter reid_hit_counter_max determines for how long you will keep trying to match it with not initialized TrackedObject instances before saying 'fuck it, this object seems to have disappeared forever, I will just destroy it'. This can help if you think your object is getting destroyed too quickly, and instead you should keep trying to merge it with new not yet initialized tracked objects for a little longer. You might want to try this.
Increase the initialization_delay: This makes the Tracker wait longer before finally 'initializing' a new TrackedObject as a real object that will be returned by the Tracker.update method. That means each new TrackedObject will be compared for more frames with the unmatched TrackedObjects (trying to merge them) before the Tracker decides that it was actually a new thing. This can help if you think your Tracker should wait a little longer before initializing new TrackedObject instances, so that they remain not-yet-initialized and keep being compared with unmatched tracked objects for a little longer.
Increasing the reid_distance_threshold: Maybe your unmatched TrackedObject doesn't get destroyed too quickly and you actually managed to compare it several times with the new not yet initialized TrackedObject (yay!), but the embedding_distance between them was so high that the Tracker wasn't able to tell that these two objects actually corresponded to the same thing. If you increase this threshold, you can match objects with greater embedding_distance. You might want to play with that.
Changing your embedding_distance: So instead of looking at one embedding for each TrackedObject in your embedding_distance, you might want to use several of their embeddings, compare them and either take the average distance between their embeddings, or take the minimum distance, for example. Of course, doing that might make the comparison slower, since you will be comparing many embeddings to many embeddings, instead of comparing just one embedding with one embedding. I will put some code for the example of taking the minimum.

def minimum_embedding_distance(matched_not_init_trackers, unmatched_trackers):
    list_of_snd_embedding = []
    list_of_fst_embedding = []

    # get the embeddings of the unmatched_trackers
    if unmatched_trackers.last_detection.embedding is not None:
        list_of_snd_embedding.append(unmatched_trackers.last_detection.embedding)
    for detection in unmatched_trackers.past_detections:
        if detection.embedding is not None:
            list_of_snd_embedding.append(detection.embedding)

    if len(list_of_snd_embedding)==0:
        return 1

    # get the embeddings of the matched_not_init_trackers
    if matched_not_init_trackers.last_detection.embedding is not None:
        list_of_fst_embedding.append(matched_not_init_trackers.last_detection.embedding)
    for detection in matched_not_init_trackers.past_detections:
        if detection.embedding is not None:
            list_of_fst_embedding.append(detection.embedding)

    if len(list_of_fst_embedding)==0:
        return 1

    # compare all the embeddings
    distances = []
    for embedding1 in list_of_fst_embedding:
        for embedding2 in list_of_snd_embedding:
            distances.append(1 - cosine(embedding1, embedding2))

    # take the minimum (you could take the average with np.mean instead)
    return np.min(np.array(distances))

You can keep playing with your embedding_distance. Maybe if you can combine the past embeddings, like averaging them or something (I haven't seen the VGG-face model for the embeddings, so I don't know what is possible with those), you might be able to avoid the nested for loops that I had in my example which makes the comparison slower.
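A minimal sketch of the averaging idea, assuming embeddings are plain sequences of floats of equal length and that averaging them is meaningful for the embedding model (the thread doesn't confirm that it is for VGG-face). `mean_embedding`, `cosine_distance` and `mean_embedding_distance` are hypothetical helpers, not part of Norfair:

```python
import math


def mean_embedding(embeddings):
    """Element-wise mean of equal-length embeddings, skipping None entries."""
    valid = [e for e in embeddings if e is not None]
    if not valid:
        return None
    count = len(valid)
    return [sum(values) / count for values in zip(*valid)]


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0  # treat a zero vector as "no information"
    return 1.0 - dot / norm


def mean_embedding_distance(fst_embeddings, snd_embeddings):
    """Compare one averaged embedding per object instead of every pair."""
    fst = mean_embedding(fst_embeddings)
    snd = mean_embedding(snd_embeddings)
    if fst is None or snd is None:
        return 1.0
    return cosine_distance(fst, snd)
```

In a real distance function, the two input lists would be collected from `last_detection` and `past_detections` in the same way as in the example above, and the cost drops from one comparison per pair of embeddings to a single comparison per pair of objects.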

I haven't tried the particular distance I wrote in this example, but hopefully you can understand what I wanted to say even if it doesn't run as it is written now.
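One detail worth checking before running a distance like the one above: the thread never shows where `cosine` is imported from. If it is `scipy.spatial.distance.cosine`, that function already returns a distance (1 minus the cosine similarity), so `1 - cosine(a, b)` would be a similarity, and two embeddings of the same face would score near 1 rather than near 0. The pure-Python sketch below, with hypothetical helper names, illustrates the two conventions:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: 1 for identical directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Distance convention: 0 for identical directions, 1 for orthogonal ones."""
    return 1.0 - cosine_similarity(a, b)


same_face = [3.0, 4.0]

# A distance function should return a small value for matching embeddings.
print("distance  :", cosine_distance(same_face, same_face))
# Subtracting a *distance* from 1 flips the meaning: matches now look far apart.
print("1 - dist. :", 1.0 - cosine_distance(same_face, same_face))
```

Whatever is returned has to be small for embeddings of the same object, because a pair is only merged when the value falls below reid_distance_threshold.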

If you are changing the embedding_distance, also consider playing with the past_detections_length parameter. Maybe seeing more embeddings for each TrackedObject in your embedding_distance might help to see if they are actually the same object or not.

Thanks a lot, @aguscas ! I'll try these suggestions and shall keep you posted.

Hi @aguscas ,

I have tried your suggestions and here's my analysis thus far:

Initial tracker parameters:

hit_counter_max: 10
reid_hit_counter_max: 500
initialization_delay: 3
reid_distance_threshold: 0.5
embedding_distance: cosine similarity, one to one comparison
past_detections_length: 5

1) hit_counter_max: I kept everything else same and just increased the hit_counter_max but it didn't help. No improvements in tracking occluded objects. It created another issue where objects were taking too long to disappear and I had to decrease this parameter.

2) reid_hit_counter_max: I am testing my code on a very simple video with 500 frames and have set this parameter to 500. I don't think it will make any difference if I increase this parameter. So, I didn't try it. Let me know if my interpretation is incorrect.

3) initialization_delay: Since this parameter can only have values between 0 and hit_counter_max, I tried a few combinations. For one set of values (initialization_delay = 30 and hit_counter_max = 60), tracking seemed to work really well but created two issues: i) objects took longer to start, and ii) objects took longer to disappear. I believe that to fix these issues I need to decrease both parameters, but that makes tracking work poorly.

4) reid_distance_threshold: Increasing this parameter seemed to work really well for tracking occluded objects. I had to set this to a very high value of 0.9 for tracking to work. I am not sure if increasing this parameter is the right way to go. What if this causes confusion when there are similar looking objects (in my case faces) in the frame! This makes me wonder if my embedding model is the real culprit. I am going to try a few other face embedding models to see if something else works better.

5) embedding_distance: I tried your minimum_embedding_distance method with the default reid_distance_threshold of 0.5 but it didn't help. This seems like a great suggestion and I think the problem here is reid_distance_threshold. Once I am able to find a better embedding model, I'll give this another try.

6) past_detections_length: I increased this parameter from default value 5 to 30 to 300 and used minimum_embedding_distance method but it didn't help. Similar to point 5 above, I think it could be useful once I find a better embedding model.

I think my next step is to try some other embedding models for reid.

I want to ask a question on the issues I foresee: is there a set of recommended values for these tracker parameters? When I played with initialization_delay and hit_counter_max, I observed that if I increased initialization_delay, objects took longer to start and if I increased hit_counter_max, objects took longer to disappear. Any tips and tricks to handle it?

Thanks a lot for your help with this! I really appreciate it!

Mmhh, the results you observed in your analysis make sense. They make me think that the problem might be in one of the following:

The embedding model for Reid.
The embedding_distance and the reid_distance_threshold.
Both of them

So you could keep playing with those. When changing the embedding_distance, you might also need to change the reid_distance_threshold accordingly; treat embedding_distance and reid_distance_threshold as a pair and tune them together. What I mean is, if you define an embedding_distance, you might want to see what typical values you get when comparing two different TrackedObjects in your footage. So, for example, inside the for loop where you iterate over the frames (after calling the Tracker.update method), you can try something like:

# define all_tracked_objects, which contains all the trackers that the Tracker has available
# this list includes trackers that haven't matched with detections in a while, but are still compared to not-yet-initialized tracked objects
# this list also includes the uninitialized tracked objects
all_tracked_objects = tracker.tracked_objects 

if len(all_tracked_objects)>1:
    for n, tracked_object1 in enumerate(all_tracked_objects):
        for tracked_object2 in all_tracked_objects[n+1:]:
            print(f"embedding_distance(obj {tracked_object1.id}, obj {tracked_object2.id}) = {embedding_distance(tracked_object1, tracked_object2)}")

Doing that, you can see the typical values of your embedding distance, both when the two tracked objects should match and when they shouldn't, and from those values work out a good reid_distance_threshold for that embedding_distance. You can make a video and draw the bounding boxes and ids of each element in the all_tracked_objects list, to see which id corresponds to which object.

What I mentioned before is merely for developing (just to check that you have a good embedding_distance, and that you picked the reid_distance_threshold accordingly), once you have done that you should erase that part of the code.
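One way to turn those printed values into a threshold, sketched under the assumption that you label by hand which pairs were the same face and which were different. The function name and the sample numbers are hypothetical; they are not measurements from this thread:

```python
def suggest_reid_threshold(same_object_distances, different_object_distances):
    """Return a threshold halfway between the two groups, or None if they overlap."""
    worst_same = max(same_object_distances)            # largest distance that should still match
    best_different = min(different_object_distances)   # smallest distance that must not match
    if worst_same >= best_different:
        # No single threshold separates the groups: the embedding model or
        # the embedding_distance itself probably needs to change.
        return None
    return (worst_same + best_different) / 2


# Made-up example values, for illustration only.
print(suggest_reid_threshold([0.10, 0.30], [0.50, 0.90]))
print(suggest_reid_threshold([0.10, 0.60], [0.50, 0.90]))
```

If the function returns None for your footage, raising reid_distance_threshold alone will start merging different faces, which is the risk raised earlier in the thread.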

I would love to be able to recommend you some embedding model, but sadly that is something we struggled a lot also when we were looking for models for our Reid demo. If you find any good embedding model, we would love to hear about which one you find.

You might not need to tweak other parameters like the reid_hit_counter_max (at least not that much), since it seems that the core issue is not that the objects were destroyed too soon to compare them to the uninitialized objects: you mentioned that when you increased the reid_distance_threshold you actually managed to match them with an uninitialized TrackedObject.

Regarding the recommended values of the tracker parameters, you might find typical values we used in some of our demos, and also the default values we use in the Tracker class can serve as a reference.

Remember, there is a trade-off in all of these arguments.

As you saw, if you set initialization_delay too high, all TrackedObject instances will take too long to get initialized by the Tracker, and if it is too low then any random detection that didn't match a TrackedObject (for example, any false positive) will immediately initialize a new TrackedObject, which is undesirable.
Similarly, you saw that setting hit_counter_max too high means objects might take too long to get destroyed, and if it is too low you might destroy an object (or at least stop comparing it with Detection instances) too soon under any brief occlusion.

You might need to consider the FPS of your footage. If you have a high FPS, you can afford to have a higher hit_counter_max and initialization_delay, since you will be calling the Tracker.update method more times for each second in the video, so you can wait a few more iterations before either destroying or initializing TrackedObject instances.

Thinking of typical values like 25 or 30 fps, I think it is good to use values like

hit_counter_max: in the low two digits, like 15, 20 or 30, so that once a TrackedObject stops matching Detection instances, you still have hope for about a second of footage that it will match a Detection, before losing all hope and you stop trying to match it with Detection instances.
initialization_delay: one digit, like between 4 and 8, so that you wait about a fifth or a third of a second before actually returning any new TrackedObject to the user.
past_detections_length: depends on your memory, since storing more detections per TrackedObject requires more memory. I wouldn't go above one-digit numbers anyway; I think the default is 4, which sounds reasonable.
reid_distance_threshold: has to be chosen according to whatever reid_distance_function you are using, in the same spirit in which distance_threshold is chosen according to the distance_function.
reid_hit_counter_max: depends on how long you expect your occlusions to be. If you expect that some faces might not be visible for about N frames, then reid_hit_counter_max should be a little higher than N. So for the 30 fps example, if you expect that your occlusions can take up to 5 seconds, that is 150 frames without seeing the object! So somewhere in the three digits might be okay, like 300, to give an example.
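The rules of thumb above can be written as a small helper that scales the frame counts with the FPS. This is only a sketch: `suggest_tracker_params` is a hypothetical function, and the factors (one second for hit_counter_max, a fifth of a second for initialization_delay, twice the longest expected occlusion for reid_hit_counter_max) are assumptions read off the examples in this reply, not Norfair defaults:

```python
def suggest_tracker_params(fps, max_occlusion_seconds):
    """Starting values (in frames) for the FPS-dependent Tracker arguments."""
    return {
        # keep matching against detections for about one second of footage
        "hit_counter_max": round(fps * 1.0),
        # wait about a fifth of a second before returning a new object
        "initialization_delay": round(fps * 0.2),
        # leave a 2x margin over the longest occlusion you expect
        "reid_hit_counter_max": round(fps * max_occlusion_seconds * 2),
    }


print(suggest_tracker_params(fps=30, max_occlusion_seconds=5))
```

The keys match the Tracker argument names used in this thread, so the dictionary could be unpacked into the constructor alongside the distance functions and thresholds, which do not depend on the FPS and still have to be chosen separately.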

Thanks a lot, @aguscas ! I'll keep experimenting and shall keep you posted. Keeping this issue open for now but will close within 24 hours if I don't have any follow-up questions.

Hi @aguscas ,

I have a couple of follow-up questions:

1) draw_boxes: I am saving my output as a video and would also like to draw an object's ID along with its bounding box. How do I do that? I see in the code here that draw_ids is True by default but I don't see an ID in my output. I just see the bounding box. I am passing tracked_objects to this method.

https://github.com/tryolabs/norfair/blob/009a1b171ab14336d79d7b7b02dfa5f45066c79e/norfair/drawing/draw_boxes.py#L22

2) reid_hit_counter_max: Does this counter change only once per frame? From what I can understand so far, I think the answer is no. For my use case, if I want the ability to identify someone even if they have been away for a long time, what are my options? Can I set this to a really high value? Or, if I want the ability to always identify someone I have seen in the past, is saving this information outside Norfair my only option?

Thanks for your help with this!

1) Mmhh... that's odd. Try setting the text_size argument to a high value (like 3, 6, or 10), just to check if the problem is due to the text being drawn too small (in which case, I will fix the way we compute the default text_size). Otherwise, I am not sure what could be the problem. Are you passing the tracked_objects returned by the Tracker.update method? Or the whole list Tracker.tracked_objects (which also includes the uninitialized TrackedObject instances)? Maybe @javiber knows what is going on, since he worked on the drawing modules.
2) You can set the reid_hit_counter_max to a huge value if you want to be able to re-identify objects after they have been lost for a long time. I think if you actually set reid_hit_counter_max = np.inf, Norfair should work fine without ever destroying the object after an arbitrarily long occlusion, but I'm not sure I would recommend that: if your footage is too long, you might end up storing an arbitrarily large number of TrackedObject instances in your Tracker, since you never destroy anything.

A brief explanation of the hit_counter and reid_hit_counter

A TrackedObject instance has an attribute called TrackedObject.hit_counter which is increased by one in the Tracker.update method whenever it matches a Detection (saturating at the value of hit_counter_max, so the TrackedObject will not increase its hit counter when TrackedObject.hit_counter is equal to hit_counter_max), and decreased by one whenever it doesn't match any Detection.
The TrackedObject.reid_hit_counter attribute of a TrackedObject instance is usually set to None, except when the TrackedObject.hit_counter stops being positive (due to many frames without matching a Detection), at that point we set TrackedObject.reid_hit_counter equal to reid_hit_counter_max. From that moment, you will not try to match the TrackedObject with Detections, you will only try to match it with not yet initialized TrackedObject instances. When your TrackedObject doesn't match any uninitialized TrackedObject, Norfair decreases its reid_hit_counter by one. That keeps happening in every Tracker.update iteration, until that TrackedObject instance matches an uninitialized TrackedObject (in which case they are merged), or until the reid_hit_counter turns negative (in which case, the object is destroyed).
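The two counters described above can be mimicked with a toy state machine. This is a simplified model of the explanation in this reply, not Norfair's actual implementation: `ToyTrackedObject` and its `step` method are hypothetical, and what happens to the counters after a merge is an assumption made only to keep the example complete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyTrackedObject:
    hit_counter_max: int
    reid_hit_counter_max: int
    hit_counter: int = 1
    reid_hit_counter: Optional[int] = None  # None while the object is tracked normally
    alive: bool = True

    def step(self, matched_detection: bool, merged: bool = False) -> str:
        """Advance one Tracker.update iteration and return the resulting phase."""
        if not self.alive:
            return "destroyed"
        if self.reid_hit_counter is None:
            # Normal phase: the object is compared with detections.
            if matched_detection:
                self.hit_counter = min(self.hit_counter + 1, self.hit_counter_max)
            else:
                self.hit_counter -= 1
            if self.hit_counter <= 0:
                self.reid_hit_counter = self.reid_hit_counter_max
                return "reid_only"
            return "tracking"
        # ReID phase: only compared with not-yet-initialized objects.
        if merged:
            self.reid_hit_counter = None
            self.hit_counter = 1  # toy choice; the real value is an implementation detail
            return "merged"
        self.reid_hit_counter -= 1
        if self.reid_hit_counter < 0:
            self.alive = False
            return "destroyed"
        return "reid_only"


obj = ToyTrackedObject(hit_counter_max=3, reid_hit_counter_max=2)
phases = [obj.step(matched) for matched in [True, True, False, False, False, False, False, False]]
print(phases)
```

Under this model, the counters change at most once per Tracker.update call, and an object that never matches again survives roughly hit_counter_max plus reid_hit_counter_max further updates before it is destroyed.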

Hopefully @javiber might be able to answer your question regarding the drawing, I never had a problem with drawing the ids. I will check that on Monday to see what might be the issue.

One thing I also just realized, is that draw_ids is not True by default in the latest release! That was changed afterwards here, so maybe installing Norfair from the master branch might also fix that. You can also try passing the argument draw_ids=True in your code when calling draw_boxes.

It will be set to True by default in the next release.

Thanks @aguscas !

1) Yes, I am passing the tracked_objects returned by the Tracker.update method to the draw_boxes method. I am running an experiment right now and shall let you know if this works.

Installing from master made the ids visible as well. Thanks for all your help! Closing this issue.

@utility-aagrawal Hey, was your occlusion issue solved? If yes, please let me know how you did it.

@smyousaf1 , you can check my code in issue #307. It's not perfect but it works pretty well. It's really slow right now, but if speed is not a priority for you, you can give it a try.

tryolabs / norfair

Reid in case of occlusion/object going out and coming back in the frame #298