Closed Rzx520 closed 1 year ago
I have another question: on what basis do you determine unknown targets? Is it determined by a threshold? @orrzohar
Hi @Rzx520,
The multiplication of the objectness with the object class probabilities is right here: https://github.com/orrzohar/PROB/blob/10b6518f90495e07b7baf0d1bfa353f0e583eb8e/models/prob_deformable_detr.py#L532
If you mean unknown object predictions: no threshold is applied. I pick the top-k (k=100) most confident predictions (known + unknown) per image: https://github.com/orrzohar/PROB/blob/10b6518f90495e07b7baf0d1bfa353f0e583eb8e/models/prob_deformable_detr.py#L534
This is in line with previous/current works and actually originates from D-DETR/DETR; it is common in cases where one wants to evaluate recall (e.g., Recall@100/10 predictions per image; see class-agnostic OD papers like LDET).
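The two steps described above (multiplying objectness into the class probabilities, then keeping the top-k scores) can be sketched roughly as follows. This is an illustrative sketch, not the repo's actual code; the function name and array shapes are assumptions.

```python
import numpy as np

def topk_predictions(obj_prob, class_prob, k=100):
    """Illustrative sketch of D-DETR-style post-processing.

    obj_prob: (num_queries,) objectness probabilities.
    class_prob: (num_queries, num_classes) class probabilities.
    Returns the top-k scores over all (query, class) pairs.
    """
    # Objectness multiplies the class probabilities to give final scores.
    scores = obj_prob[:, None] * class_prob
    flat = scores.ravel()
    # Keep the k highest scores across all (query, class) pairs.
    topk_idx = np.argsort(flat)[::-1][:k]
    query_idx, class_idx = np.unravel_index(topk_idx, scores.shape)
    return flat[topk_idx], query_idx, class_idx
```

Note that no threshold appears anywhere: the output is always k predictions per image, however confident (or not) they are.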
Hope this helps! Orr
Will the 100 confident predictions contain a lot of background? @orrzohar
I may have misunderstood your meaning. When you choose the top k (k=100), will there be 100 predicted results, since they include 100 known + unknown? But I don't see 100 predicted results in your visualizations. How do you determine which is background, which is unknown, and which is known? @orrzohar Thanks
Hi @Rzx520,
The background should be suppressed, as the obj_prob for background should be quite low; therefore, when selecting the top-k predictions, the predictions favor known/unknown objects.
Indeed the model makes 100 predictions per image, but I did not use all 100 for the figures. For a description of how I created Figure 3, please look at issue https://github.com/orrzohar/PROB/issues/11.
If you want to use PROB in a more realistic inference scenario, I would threshold (the known and unknown objects separately) to get more reasonable results, perhaps with NMS.
Let me know if you have any additional questions, Orr
I would also like to know how you determined that this is background, and whether there is a specific obj_prob value used to decide that anything below it is background? @orrzohar
I don't quite understand the sentence "cycled through the GT (both known and unknown) objects, and if a model had a prediction of the same class/IoU>0.5, I added that bbox on the image". Can you explain it in detail? Do unknown objects also have GT? How was the GT of an unknown object obtained? @orrzohar
hi @Rzx520,
obj_prob is the objectness and is low for background and high for objects.
There is no hard threshold on the obj values; there is an implicit one. PROB can make 100*(num_classes+1) predictions per image. When selecting the top 100, the confidence of the 101st prediction can be thought of as a threshold. However, I prefer to conceptualize it as down-weighting predictions that are likely background, so that background is not predicted as a known/unknown object.
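The "implicit threshold" idea above can be made concrete: among all candidate scores, the (k+1)-th highest score is the cutoff that top-k selection effectively applies for that image. A small sketch (illustrative names, not repo code):

```python
import numpy as np

def implicit_threshold(scores, k=100):
    """Return the top-k scores and the implicit per-image cutoff.

    scores: flattened (query, class) scores for one image.
    The (k+1)-th highest score acts like a threshold: everything
    at or above the k-th score is kept, everything below is dropped.
    """
    order = np.sort(scores)[::-1]          # descending
    topk = order[:k]
    # Score of the first rejected prediction; 0 if fewer than k+1 exist.
    cutoff = order[k] if len(order) > k else 0.0
    return topk, cutoff
```

Unlike a fixed threshold, this cutoff varies per image, which is why cluttered images can still yield many detections while sparse ones admit low-confidence ones.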
Re GT unknown annotations: yes, there are! That is how U-Recall is calculated. I visualized it in this way because of the way mAP is calculated, where you first sort the predictions based on confidence and then see if they have a high enough IoU to a GT object. For more on how mAP is calculated, please see this. So, in order for the qualitative and quantitative results to match up well, they should be created in a similar fashion.
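The matching procedure described above (sort predictions by confidence, then greedily match each to an unused GT of the same class with IoU above 0.5) can be sketched as follows. This is an illustrative sketch of standard mAP-style matching, not the repo's evaluation code; names and data layout are assumptions.

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match_predictions(preds, gts, iou_thr=0.5):
    """preds: list of (score, cls, box); gts: list of (cls, box).

    Returns indices of predictions that match a previously unmatched
    GT of the same class, visiting predictions in confidence order.
    """
    order = sorted(range(len(preds)), key=lambda i: preds[i][0], reverse=True)
    used, matched = set(), []
    for i in order:
        _, cls, box = preds[i]
        for j, (gcls, gbox) in enumerate(gts):
            if j not in used and gcls == cls and iou(box, gbox) >= iou_thr:
                used.add(j)        # each GT can be matched at most once
                matched.append(i)
                break
    return matched
```

For the figures, only matched boxes would be drawn, which is why far fewer than 100 boxes appear per image.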
The "unknown" object annotations are those same objects that are hidden from the model at task t. For example, in M-OWODB, COCO is separated into 4 subsets, where in each task an additional 20 classes are introduced. So in T1, you have 20 known classes in training and 60 unknown classes during evaluation. In T2, you have 40 classes in the ft dataset, 40 unknown classes in evaluation, and so forth.
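The task structure above can be written down as a tiny helper: at task t, the first 20*t classes are known and the remaining COCO classes count as unknown at evaluation. The function below is an illustrative sketch of that split, not code from the repo.

```python
def known_unknown_split(task, num_classes=80, classes_per_task=20):
    """M-OWODB-style split: at task t, the first t*20 class ids are
    known; all later class ids are 'unknown' during evaluation."""
    n_known = task * classes_per_task
    known = list(range(n_known))
    unknown = list(range(n_known, num_classes))
    return known, unknown
```

So T1 yields 20 known / 60 unknown classes, T2 yields 40 / 40, and by T4 every class is known and nothing remains unknown.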
Does this make more sense now? Orr
Thank you very much for your reply. If I understand correctly, your unknown GT is in the COCO dataset: apart from the n trained classes, the other (80-n) classes are its unknown GT, right? @orrzohar
Can I understand it this way? It is equivalent to a generalization task: by learning the objectness of known-class objects, the model gains the ability to detect objects (known + unknown), and then detects both unknown and known objects. I have another question: why did you choose 100 predictions as the prediction result (known + unknown)? Won't that lead to a lot of wasted detections? After all, it's rare for an image to contain 100 objects (known + unknown). Is it just because DETR gives 100 predictions? Thanks @orrzohar
"RuntimeError: Timed out initializing process group in store based barrier on rank: 2, for key: store_based_barrier_key:1 (world_size=3, worker_count=…, timeout=0:30:00)" After training one epoch, training the second epoch reported this error. Do you know how to solve it? @orrzohar
Hi @Rzx520,
Also, would you mind opening a separate issue for the runtime error? That way it would be easier for future users to find.
Best, Orr
The paper says "For class prediction, the learned objectness probability multiplies the classification probabilities to produce the final class predictions", and this multiplication should be reflected in the code, but I'm sorry, I couldn't find it.