Closed · utility-aagrawal closed this 9 months ago
Hello again @utility-aagrawal!

I have a few ideas that can help you, but of course that depends on the root cause of the problem. Most of these involve tweaking some of the parameters of your `Tracker` instance:
- `hit_counter_max`: In a previous response I mentioned that one of the steps in Norfair is matching `TrackedObject` instances with `Detection` instances. If a `TrackedObject` doesn't match a `Detection` in a particular frame, we then try to match it with new `TrackedObject` instances (the not-yet-initialized ones). But if you get no matches at all (with either detections or uninitialized tracked objects), you might still want to try the same thing again in the next frame (compare that `TrackedObject` with new detections and uninitialized tracked objects). `hit_counter_max` determines for how many consecutive frames without any match you will still try to match that object with a `Detection`, before saying 'fuck it, I will just only try to compare it with not-yet-initialized `TrackedObject` instances'. This can help if your occlusions don't last too long and the object could have matched a detection again before getting destroyed. I don't think this is the first thing you should try, but I mention it first because it makes the explanation of the other parameters simpler.
- `reid_hit_counter_max`: As I mentioned before, if a `TrackedObject` matches neither a `Detection` nor an uninitialized `TrackedObject` for many consecutive frames, you will start trying to match it only with uninitialized `TrackedObject` instances. `reid_hit_counter_max` determines for how long you will keep trying to match it with uninitialized `TrackedObject` instances before saying 'fuck it, this object seems to have disappeared forever, I will just destroy it'. This can help if your object is getting destroyed too quickly and you should instead keep trying to merge it with new not-yet-initialized tracked objects for a little longer. You might want to try this.
- `initialization_delay`: This makes the `Tracker` wait longer before finally 'initializing' a new `TrackedObject` as a real object that is returned by the `Tracker.update` method. That means each new `TrackedObject` is compared for more frames with the unmatched `TrackedObject` instances (trying to merge them) before deciding that it is actually a new object. This can help if you think your `Tracker` should wait a little longer before initializing new `TrackedObject` instances, so that they remain not-yet-initialized and keep being compared with unmatched tracked objects for a little longer.
- `reid_distance_threshold`: Maybe your unmatched `TrackedObject` doesn't get destroyed too quickly, and you actually manage to compare it several times with the new not-yet-initialized `TrackedObject` (yay!), but the `embedding_distance` between them is so high that the `Tracker` can't tell that these two objects actually correspond to the same thing. If you increase this threshold, you can match objects with a greater `embedding_distance`. You might want to play with that.
- `embedding_distance`: Instead of looking at one embedding for each `TrackedObject` in your `embedding_distance`, you might want to use several of their embeddings, compare them all, and take either the average or the minimum of the distances between them. Of course, doing that might make the comparison slower, since you will be comparing many embeddings against many embeddings instead of one against one. Here is some code for the example of taking the minimum:

```python
import numpy as np
from scipy.spatial.distance import cosine  # cosine distance = 1 - cosine similarity

def minimum_embedding_distance(matched_not_init_trackers, unmatched_trackers):
    list_of_snd_embedding = []
    list_of_fst_embedding = []
    # get the embeddings of the unmatched_trackers
    if unmatched_trackers.last_detection.embedding is not None:
        list_of_snd_embedding.append(unmatched_trackers.last_detection.embedding)
    for detection in unmatched_trackers.past_detections:
        if detection.embedding is not None:
            list_of_snd_embedding.append(detection.embedding)
    if len(list_of_snd_embedding) == 0:
        return 1
    # get the embeddings of the matched_not_init_trackers
    if matched_not_init_trackers.last_detection.embedding is not None:
        list_of_fst_embedding.append(matched_not_init_trackers.last_detection.embedding)
    for detection in matched_not_init_trackers.past_detections:
        if detection.embedding is not None:
            list_of_fst_embedding.append(detection.embedding)
    if len(list_of_fst_embedding) == 0:
        return 1
    # compare all the embeddings pairwise
    distances = []
    for embedding1 in list_of_fst_embedding:
        for embedding2 in list_of_snd_embedding:
            distances.append(cosine(embedding1, embedding2))
    # take the minimum (you could take the average with np.mean instead)
    return np.min(np.array(distances))
```
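To plug a function like this into the tracker, you would pass it as the ReID distance when constructing your `Tracker`. A minimal configuration sketch, assuming the parameter names discussed in this thread (the concrete values are only illustrative, not recommendations):

```python
from norfair import Tracker

tracker = Tracker(
    distance_function="euclidean",  # spatial distance for frame-to-frame matching
    distance_threshold=100,
    hit_counter_max=15,
    initialization_delay=5,
    reid_distance_function=minimum_embedding_distance,  # the function above
    reid_distance_threshold=0.5,
    reid_hit_counter_max=300,
    past_detections_length=5,
)
```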
You can keep playing with your `embedding_distance`. Maybe if you can combine the past embeddings, for example by averaging them (I haven't seen the VGG-Face model used for the embeddings, so I don't know what is possible with those), you might be able to avoid the nested for loops in my example, which make the comparison slower. I haven't tried the particular distance I wrote in this example, but hopefully you can see what I mean even if it needs some adjusting to run.
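As a rough illustration of that averaging idea, here is a hedged sketch (the helper names are my own, and it assumes the embeddings are plain numeric vectors; the cosine distance is computed with NumPy so the snippet stays self-contained):

```python
import numpy as np

def cosine_distance(u, v):
    # cosine distance = 1 - cosine similarity
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def collect_embeddings(tracked_object):
    # gather the last and past detection embeddings of a tracked object
    embeddings = []
    if tracked_object.last_detection.embedding is not None:
        embeddings.append(tracked_object.last_detection.embedding)
    for detection in tracked_object.past_detections:
        if detection.embedding is not None:
            embeddings.append(detection.embedding)
    return embeddings

def mean_embedding_distance(matched_not_init_trackers, unmatched_trackers):
    fst = collect_embeddings(matched_not_init_trackers)
    snd = collect_embeddings(unmatched_trackers)
    if not fst or not snd:
        return 1
    # average each object's embeddings first, then do a single comparison
    return cosine_distance(np.mean(fst, axis=0), np.mean(snd, axis=0))
```

This replaces the nested loops with one distance computation per pair of objects; whether averaging embeddings is meaningful depends on the embedding model.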
Along with the `embedding_distance`, also consider playing with the `past_detections_length` parameter. Maybe seeing more embeddings for each `TrackedObject` in your `embedding_distance` might help to tell whether two objects are actually the same or not.

Thanks a lot, @aguscas! I'll try these suggestions and shall keep you posted.
Hi @aguscas ,
I have tried your suggestions and here's my analysis thus far:
Initial tracker parameters:
- hit_counter_max: 10
- reid_hit_counter_max: 500
- initialization_delay: 3
- reid_distance_threshold: 0.5
- embedding_distance: cosine similarity, one-to-one comparison
- past_detections_length: 5

1) hit_counter_max: I kept everything else the same and just increased hit_counter_max, but it didn't help: no improvement in tracking occluded objects. It also created another issue where objects took too long to disappear, and I had to decrease this parameter.
2) reid_hit_counter_max: I am testing my code on a very simple video with 500 frames and have set this parameter to 500. I don't think increasing it further will make any difference, so I didn't try it. Let me know if my interpretation is incorrect.
3) initialization_delay: Since this parameter can only have values between 0 and hit_counter_max, I tried a few combinations. For one set of values (initialization_delay = 30 and hit_counter_max = 60), tracking seemed to work really well, but it created two issues: i) objects took longer to start, and ii) objects took longer to disappear. I believe that to fix these issues I need to decrease both parameters, but that makes tracking work poorly.
4) reid_distance_threshold: Increasing this parameter seemed to work really well for tracking occluded objects. I had to set it to a very high value of 0.9 for tracking to work, and I am not sure increasing it this much is the right way to go. What if it causes confusion when there are similar-looking objects (in my case, faces) in the frame? This makes me wonder if my embedding model is the real culprit. I am going to try a few other face embedding models to see if something else works better.
5) embedding_distance: I tried your minimum_embedding_distance method with the default reid_distance_threshold of 0.5, but it didn't help. This seems like a great suggestion, and I think the problem here is the reid_distance_threshold. Once I find a better embedding model, I'll give this another try.
6) past_detections_length: I increased this parameter from the default value of 5 to 30 and then 300 and used the minimum_embedding_distance method, but it didn't help. As in point 5 above, I think it could be useful once I find a better embedding model.
I think my next step is to try some other embedding models for reid.
I also want to ask about an issue I foresee: is there a set of recommended values for these tracker parameters? When I played with initialization_delay and hit_counter_max, I observed that increasing initialization_delay made objects take longer to start, and increasing hit_counter_max made objects take longer to disappear. Any tips and tricks to handle this trade-off?
Thanks a lot for your help with this! I really appreciate it!
Mmhhh, the results you observed in your analysis make sense. They make me think that the problem might be in either the `embedding_distance` or the `reid_distance_threshold`, so you could keep playing with those. When changing the `embedding_distance`, you might need to change the `reid_distance_threshold` accordingly; consider the pair `embedding_distance` and `reid_distance_threshold` simultaneously. What I mean is: if you define an `embedding_distance`, you might want to see what typical values you get when comparing two different `TrackedObjects` in your footage. So, for example, inside the for loop where you iterate over the frames (say, after calling the `Tracker.update` method), you can try something like:
```python
# tracker.tracked_objects contains all the trackers the Tracker has available.
# This list includes trackers that haven't matched with detections in a while
# but are still compared to not-yet-initialized tracked objects, and it also
# includes the uninitialized tracked objects themselves.
all_tracked_objects = tracker.tracked_objects
if len(all_tracked_objects) > 1:
    for n, tracked_object1 in enumerate(all_tracked_objects):
        for tracked_object2 in all_tracked_objects[n + 1:]:
            print(
                f"embedding_distance(obj {tracked_object1.id}, obj {tracked_object2.id})"
                f" = {embedding_distance(tracked_object1, tracked_object2)}"
            )
```
Doing that, you can see the typical values of your embedding distance both when two tracked objects should match and when they shouldn't, which should suggest a good `reid_distance_threshold` for that `embedding_distance`. You can also make a video drawing the bounding boxes and ids of each element in the `all_tracked_objects` list, to see which id corresponds to which object.

What I mentioned above is merely for development (just to check that you have a good `embedding_distance` and that you picked the `reid_distance_threshold` accordingly); once you have done that, you should remove that part of the code.
I would love to be able to recommend an embedding model, but sadly that is something we also struggled with a lot when we were looking for models for our ReID demo. If you find any good embedding model, we would love to hear which one.
You might not need to tweak other parameters like the `reid_hit_counter_max` (at least not that much), since it seems the core issue is not that the objects were destroyed too soon to compare them with the uninitialized objects: you mentioned that when you increased the `reid_distance_threshold`, you actually managed to match them with an uninitialized `TrackedObject`.
Regarding the recommended values of the tracker parameters, you can find typical values we used in some of our demos, and the default values in the `Tracker` class can also serve as a reference.
Remember, there is a trade-off in all of these arguments. If `initialization_delay` is too high, all `TrackedObject` instances will take too long to get initialized by the `Tracker`; if it is too low, any random detection that didn't match a `TrackedObject` (for example, a false positive) will immediately initialize a new `TrackedObject`, which is undesirable. If `hit_counter_max` is too high, objects might take too long to get destroyed; if it is too low, you might destroy an object (or at least stop comparing it with `Detection` instances) too quickly under any brief occlusion. You might also need to consider the FPS of your footage. If you have a high FPS, you can afford a higher `hit_counter_max` and `initialization_delay`, since you will be calling the `Tracker.update` method more times for each second of video, so you can wait a few more iterations before either destroying or initializing `TrackedObject` instances.
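The FPS reasoning above is just a frames-per-second conversion; a tiny sketch (the helper name is hypothetical):

```python
def frames_for_seconds(fps, seconds):
    # number of Tracker.update calls that cover `seconds` of footage
    return round(fps * seconds)

# at 30 fps, tolerating ~1 second without matches before giving up:
print(frames_for_seconds(30, 1.0))  # 30, a plausible hit_counter_max
# waiting ~a fifth of a second before initializing new objects:
print(frames_for_seconds(30, 0.2))  # 6, a plausible initialization_delay
```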
Thinking of typical values like 25 or 30 fps, I think it is good to use values like:

- `hit_counter_max`: in the low two digits, like 15, 20 or 30, so that once a `TrackedObject` stops matching `Detection` instances, you still keep hoping for about a second of footage that it will match a `Detection`, before losing all hope and no longer trying to match it with `Detection` instances.
- `initialization_delay`: one digit, like between 4 and 8, so that you wait about a fifth to a third of a second before actually returning any new `TrackedObject` to the user.
- `past_detections_length`: depends on your memory, since storing more detections per `TrackedObject` requires more memory. I wouldn't go above one-digit numbers anyway; I think the default is 4, which sounds reasonable.
- `reid_distance_threshold`: has to be chosen according to whatever `reid_distance_function` you are using, in the same spirit in which `distance_threshold` is chosen according to the `distance_function`.
- `reid_hit_counter_max`: depends on how long you expect your occlusions to be. If you expect that some faces might not be visible for about N frames, then `reid_hit_counter_max` should be a little higher than N. For the 30 fps example, if you expect occlusions of up to 5 seconds, that is 150 frames without seeing the object, so something in the three digits might be okay, like 300.

Thanks a lot, @aguscas! I'll keep experimenting and shall keep you posted. I'm keeping this issue open for now, but I will close it within 24 hours if I don't have any follow-up questions.
Hi @aguscas ,
I have a couple of follow-up questions:
1) draw_boxes: I am saving my output as a video and would also like to draw each object's id along with its bounding box. How do I do that? I see in the code here that draw_ids is True by default, but I don't see an id in my output, just the bounding box. I am passing tracked_objects to this method.
2) reid_hit_counter_max: Does this counter change only once per frame? From what I can understand so far, I think the answer is no. For my use case, if I want the ability to identify someone even if they have been away for too long, what are my options? Can I set this to a really high value? And if I want the ability to always identify someone I have seen in the past, is saving this information outside Norfair my only option?
Thanks for your help with this!
Regarding the drawing issue, try setting the `text_size` argument to a high value (like 3, 6, or 10), just to check whether the problem is due to the text being drawn too small (in which case, I will fix the way we compute the default `text_size`). Otherwise, I am not sure what the problem could be. Are you passing the tracked_objects returned by the `Tracker.update` method, or the whole `Tracker.tracked_objects` list (which also includes the uninitialized `TrackedObject` instances)? Maybe @javiber knows what is going on, since he worked on the drawing modules.

Regarding re-identification, you can set `reid_hit_counter_max` to a huge value if you want to be able to re-identify an object after it has been lost for too long.
I think if you actually set `reid_hit_counter_max = np.inf`, Norfair should work fine without ever destroying the object, even after an arbitrarily long occlusion, but I'm not sure I would recommend that: if your footage is too long, you might end up storing an arbitrarily large number of `TrackedObject` instances in your `Tracker`, since you never destroy anything.

A `TrackedObject` instance has an attribute called `TrackedObject.hit_counter`, which is increased by one in the `Tracker.update` method whenever it matches a `Detection` (saturating at the value of `hit_counter_max`, so the `TrackedObject` will not increase its hit counter whenever `TrackedObject.hit_counter` is equal to `hit_counter_max`), and decreased by one whenever it doesn't match any `Detection`.
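The hit counter mechanics described above can be sketched as a toy simulation (just an illustration of the bookkeeping, not Norfair's actual code):

```python
def step_hit_counter(hit_counter, matched, hit_counter_max):
    # +1 on a match, saturating at hit_counter_max; -1 otherwise
    if matched:
        return min(hit_counter + 1, hit_counter_max)
    return hit_counter - 1

hit_counter = 0
for matched in [True, True, True, False, False]:
    hit_counter = step_hit_counter(hit_counter, matched, hit_counter_max=2)
print(hit_counter)  # 0: saturated at 2, then dropped once per unmatched frame
```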
The `TrackedObject.reid_hit_counter` attribute of a `TrackedObject` instance is usually set to `None`, except when the `TrackedObject.hit_counter` stops being positive (after many frames without matching a `Detection`); at that point we set `TrackedObject.reid_hit_counter` equal to `reid_hit_counter_max`. From that moment on, you will not try to match the `TrackedObject` with `Detections`; you will only try to match it with not-yet-initialized `TrackedObject` instances. Whenever your `TrackedObject` doesn't match any uninitialized `TrackedObject`, Norfair decreases its `reid_hit_counter` by one. That keeps happening in every `Tracker.update` iteration, until that `TrackedObject` instance matches an uninitialized `TrackedObject` (in which case they are merged), or until the `reid_hit_counter` turns negative (in which case the object is destroyed).
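Following that description, once an object enters the ReID phase, the number of consecutive frames it can survive without merging can be sketched as (a toy illustration, not Norfair's implementation):

```python
def frames_until_destroyed(reid_hit_counter_max):
    # reid_hit_counter starts at reid_hit_counter_max and drops by one per
    # Tracker.update call without a merge; the object is destroyed once the
    # counter turns negative
    reid_hit_counter = reid_hit_counter_max
    frames = 0
    while reid_hit_counter >= 0:
        reid_hit_counter -= 1
        frames += 1
    return frames

print(frames_until_destroyed(300))  # 301 frames, about 10 s at 30 fps
```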
Hopefully @javiber will be able to answer your question regarding the drawing; I never had a problem drawing the ids. I will check on Monday to see what the issue might be.

One thing I also just realized is that `draw_ids` is not `True` by default in the latest release! That was changed afterwards here, so installing Norfair from the master branch might also fix it. You can also try passing the argument `draw_ids=True` in your code when calling `draw_boxes`. It will be set to `True` by default in the next release.
Thanks @aguscas !
1) Yes, I am passing the tracked_objects returned by the Tracker.update method to the draw_boxes method. I am running an experiment right now and shall let you know if this works.
Installing from master made the ids visible as well. Thanks for all your help! Closing this issue.
@utility-aagrawal Hey, did your occlusion issue get solved? If yes, please let me know how you did it.
@smyousaf1, you can check my code in issue #307. It's not perfect, but it works pretty well. It's really slow right now, but if speed is not a priority for you, you can give it a try.
Hi @aguscas ,
I am using ReID to rematch faces in case of occlusion. I have implemented something by following the ReID demo.py example. The only changes I made are:

1) Used embeddings from the VGG-Face model:

```python
detection.embedding = DeepFace.represent(
    img_path=cut,
    model_name=embed_model,
    enforce_detection=False,
    detector_backend="retinaface",
)[0]["embedding"]
```

2) Used cosine similarity to find the distance between faces.

This is my embedding_distance method (excerpt):

```python
def embedding_distance(matched_not_init_trackers, unmatched_trackers):
    snd_embedding = unmatched_trackers.last_detection.embedding
```

For some reason it's not working, and the tracker always assigns a new id/color to a face after occlusion. I have tried some of the suggestions from this page: https://tryolabs.github.io/norfair/2.2/getting_started/#detection-issues but no luck so far.
Do you have any suggestions for me?
I can share the code and my test video, in case that will provide more clarity. Thanks!