Closed synsin0 closed 1 year ago
Thanks for your great work. I am trying to integrate prob-OWOD into the 3D domain, but I am confronted with one problem. After training, the classification branch produces logits for the known categories in roughly [-3, 0] while the unknown logit is > 0; after sigmoid and top-k selection, all objects are classified as unknown. Could you give some insight into how to keep the model from classifying knowns as unknowns?
Hi @synsin0, I need a few more details about your specific problem to be sure, but I suspect that one or both of the following is occurring:
The relative weight between unknowns and knowns in your dataset is different than in COCO. This would happen if your dataset has, on average, fewer objects per frame. You can fix it by further down-weighting the 'empty_weight' in the 'SetCriterion', which down-weights the 'background/unknown object' logit. As you are making 100 predictions per image, and let's say you have on average 5 objects per image, the relative weight should be ~0.05, since you effectively have 95 'background/unknown object' predictions per image. A training-free way to check whether this is the case is to simply down-weight the C+1 logit by some constant and see if results improve; see the sketch below.
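A minimal sketch of that training-free check, assuming DETR-style outputs where `out_logits` has shape `[batch, num_queries, C + 1]` with the last index being the 'background/unknown object' logit; the constant `unknown_down_weight` is a hypothetical knob to sweep, not a repo parameter:

```python
import torch

def down_weight_unknown(out_logits: torch.Tensor,
                        unknown_down_weight: float = 0.5) -> torch.Tensor:
    # Sigmoid scores, then suppress only the C+1 ('background/unknown') score.
    scores = out_logits.sigmoid()
    scores[..., -1] = scores[..., -1] * unknown_down_weight
    return scores
```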
Your prob-objectness head predictions are poor, and you are predicting 100 objects per image. You could potentially fix this by changing the 'obj_temp' parameter (you do NOT need to retrain to evaluate with different temperatures!). If this does not improve the objectness predictions, then the only remaining cause would be the relative weighting of the objectness, cls, and bbox losses. I actually found that the models were not that sensitive to this parameter, but perhaps the transition to 3D is different (changing the loss weights does require retraining).
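To see why the temperature sweep is cheap, here is a runnable illustration. It assumes, purely for illustration, an objectness of the form exp(-d / obj_temp) over a non-negative distance d (check the repo for the exact formula); the point is that 'obj_temp' reshapes which queries survive a confidence threshold, without any retraining:

```python
import torch

d = torch.linspace(0, 10, 11)  # hypothetical per-query distances
for obj_temp in [0.5, 1.0, 1.3, 2.0, 5.0]:
    prob = torch.exp(-d / obj_temp)
    frac = (prob > 0.5).float().mean()
    print(f"obj_temp={obj_temp}: frac(prob > 0.5) = {frac:.2f}")
```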
In your case, I would first troubleshoot to find the root of the issue. Is the objectness predicting that all proposals are objects? If so, #2 is your best bet. If the model is predicting both known and unknown objects, but the unknown-object confidence is simply always higher than the known-class confidence, then #1 is your best bet. A quick diagnostic like the one below can tell the two apart.
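For instance, a minimal diagnostic sketch, assuming objectness probabilities `pred_obj` of shape `[batch, num_queries]` and classification logits `pred_logits` of shape `[batch, num_queries, C + 1]` with the unknown class at the last index (adapt the names and shapes to your 3D heads):

```python
import torch

def diagnose(pred_obj: torch.Tensor, pred_logits: torch.Tensor) -> None:
    # Symptom of case #2: objectness saturated, nothing filtered as background.
    print(f"objectness: mean={pred_obj.mean():.3f}, "
          f"frac > 0.5 = {(pred_obj > 0.5).float().mean():.3f}")

    # Symptom of case #1: unknown confidence dominates every known class.
    scores = pred_logits.sigmoid()
    known_max = scores[..., :-1].max(dim=-1).values
    unknown = scores[..., -1]
    print(f"frac queries where unknown beats best known = "
          f"{(unknown > known_max).float().mean():.3f}")
```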
Best, Orr
Thanks for your timely response. Your two suggestions are very likely explanations for my poor performance. For each sample I have 900 queries, and the matched GT instances number around 30-50. I am curious whether I could set the 'empty_weight' adaptively, as num_matched / num_queries for each frame, which may fit better; I will give it a try (a rough sketch of what I mean is below). Second, I think the objectness determines whether a query is an object or background, but not whether it is unknown or known. Does the objectness multiply only the unknown category, or all categories? I am a little confused.
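A rough sketch of the adaptive idea, written against a DETR-style 'SetCriterion' label loss (the adaptive coefficient is my untested proposal, and the function name is illustrative):

```python
import torch
import torch.nn.functional as F

def loss_labels_adaptive(src_logits, target_classes, num_classes, num_matched):
    # src_logits: [batch, num_queries, num_classes + 1], last class = background/unknown.
    # num_matched: number of GT instances matched in this frame (e.g., 30-50 of 900).
    num_queries = src_logits.shape[1]
    weight = torch.ones(num_classes + 1, device=src_logits.device)
    weight[-1] = num_matched / num_queries  # e.g., ~40 / 900 ≈ 0.044
    return F.cross_entropy(src_logits.transpose(1, 2), target_classes, weight)
```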
Hi @synsin0, The objectness multiplies all categories (although you can change it to multiply only the unknown-object logit, which may be better in your case); both options are sketched below.
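A minimal sketch of the two options, assuming sigmoid class scores `cls_scores` of shape `[batch, num_queries, C + 1]` (last index = unknown) and objectness probabilities `obj` of shape `[batch, num_queries]`:

```python
import torch

def fuse_scores(cls_scores: torch.Tensor, obj: torch.Tensor,
                only_unknown: bool = False) -> torch.Tensor:
    scores = cls_scores.clone()
    if only_unknown:
        # Variant: gate only the unknown category with objectness.
        scores[..., -1] = scores[..., -1] * obj
    else:
        # Default: objectness gates all categories.
        scores = scores * obj.unsqueeze(-1)
    return scores
```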
The objectness determines whether a proposal is an object or background. However, let's say the objectness always predicts 1 (e.g., if the temperature is too low). Then, none of the queries will be filtered out as background.
At the same time, let's say that the relative weighting is such that the unknown-object predictions are more confident than the known-object ones. If both of these are the case, then all of the detections will be unknown objects!
If instead the objectness is set up correctly but the weighting is still off, then (for an image with 30 objects) 70 of the 100 detections would be suppressed as background. The remaining 30 proposals would first be classified as unknowns, but the same proposals would then also be classified as some known object - so you would have both unknown and known detections for the known objects.
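As a toy illustration of that duplicate-detection effect (synthetic numbers, assuming sigmoid scores and the flattened top-k selection common in Deformable-DETR-style post-processing):

```python
import torch

# One query over 3 classes: [known_a, known_b, unknown]. With top-k over the
# flattened (query, class) scores, the same query surfaces twice: once as an
# unknown detection and once as a known one.
scores = torch.tensor([[0.30, 0.10, 0.55]])  # synthetic sigmoid scores
values, indices = scores.flatten().topk(2)
print(values)   # tensor([0.5500, 0.3000])
print(indices)  # tensor([2, 0]) -> unknown first, then known class 0
```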
Hope this helps, Orr
Thanks again for your response! Now I have a general idea of the module design in your paper. I have inspected the outputs of my 'prob' and 'out_logits'; they do not behave as they do in your project settings. I will retrain my model to see what changes. I will now close this issue.