Closed Tsunehiko closed 1 year ago
You got the labels mixed up. Zero is the label for person, and with focal loss we do not have a specific background label. Every query outputs binary per-class predictions, and the threshold `self.detection_obj_score_thresh` decides whether it is considered background or not.
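To make the point above concrete, here is a minimal sketch, not the repository's actual code. It assumes each query emits one logit per class, that the logits go through a sigmoid (as is usual with focal loss), and that a query counts as background when its best score does not exceed the threshold. The names `classify_query` and `score_thresh` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify_query(logits, score_thresh):
    """Return (label, score) for one query, or (None, score) for background.

    There is no dedicated background class: every class gets an
    independent binary score, and background is simply "no class scored
    above the threshold".
    """
    scores = [sigmoid(z) for z in logits]
    score = max(scores)
    label = scores.index(score)
    if score <= score_thresh:
        return None, score  # treated as background
    return label, score
```

For example, `classify_query([2.0, -3.0], 0.4)` yields label 0 (person), while `classify_query([-2.0, -3.0], 0.4)` yields `None` because no class clears the threshold.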
Thank you for your reply. I understand the background class now, but I don't know why you used `result['labels'][-self.num_object_queries:] == 0`. Is it to limit the queries to the person class? (If so, I don't know why you set the number of classes to a value as large as 20 for MOT17.)
The project is targeted at pedestrian/person tracking, which is why we only allow those predictions. But we trained with more than a single output neuron, as this is supposed to stabilize the training. Effectively, though, we are only using the very first neuron.
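A hedged sketch of the filtering discussed here, under the assumption that the object queries occupy the last `num_object_queries` positions of the output and that track queries come before them. The function name `select_new_track_indices` and the `person_label` parameter are hypothetical, not part of the repository.

```python
def select_new_track_indices(labels, scores, num_object_queries,
                             score_thresh, person_label=0):
    """Return indices of object queries that may start a new track.

    Only the trailing `num_object_queries` entries are considered.
    A query qualifies if it is predicted as the person class (label 0)
    and its score is above the detection threshold.
    """
    start = max(len(labels) - num_object_queries, 0)
    return [
        i
        for i in range(start, len(labels))
        if labels[i] == person_label and scores[i] > score_thresh
    ]
```

With `labels = [0, 0, 0, 3, 0]`, `scores = [0.9, 0.8, 0.95, 0.99, 0.2]`, three object queries and a threshold of 0.4, only index 2 is kept: index 3 is not a person and index 4 scores too low.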
I understood it. Thank you for your detailed explanation.
Thank you for the wonderful work. I have read the paper and code, and have a question about track query initialization.
How do you select the initial track queries from the object queries during evaluation? In the paper, the following sentences are stated,
After reading this, I expected the object queries with non-zero class labels to be added to the new track queries. However, looking at the code, it seems to extract only those whose label equals 0.
I believe what is written in the paper is correct, but I cannot follow this implementation. Could you please tell me what is happening in it? Or, if I have picked out the wrong part of the implementation, please point me to the correct part.