Thank you for your great work. I have two questions.
How do I train the RPN separately? The RPN in the article seems to be related to the OLN network.
If you use the RegionCLIP model, you can already do open-vocabulary object detection directly, so why do you still use "unknown" to define a new class in the end?
Looking forward to your reply.
To train the RPN separately, we commented out lines 100-102 and 105-109 in https://github.com/binyisu/food/blob/main/food/modeling/meta_arch/rcnn.py so that only the RPN-related loss functions remain, modified the RPN structure, and then ran the base-class training code. We leveraged the centerness loss from the OLN network, which judges whether a proposal is an object from geometric cues (centerness) rather than a learned classification score, to mitigate the issue of the original RPN being class-agnostic in name only (its objectness classifier tends to overfit to the known training classes).
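For reference, here is a minimal sketch of that kind of modification, assuming the meta-architecture follows Detectron2's `GeneralizedRCNN` interface (the class name `RPNOnlyRCNN` is ours for illustration, not from the repository): the ROI-head losses are skipped, the equivalent of commenting them out in `rcnn.py`, and only the proposal-generator (RPN) losses are returned for optimization.

```python
# Hypothetical sketch of an RPN-only training forward pass on top of Detectron2.
from detectron2.modeling import GeneralizedRCNN, META_ARCH_REGISTRY


@META_ARCH_REGISTRY.register()
class RPNOnlyRCNN(GeneralizedRCNN):
    def forward(self, batched_inputs):
        # Fall back to the standard inference path at test time.
        if not self.training:
            return self.inference(batched_inputs)

        images = self.preprocess_image(batched_inputs)
        gt_instances = [x["instances"].to(self.device) for x in batched_inputs]
        features = self.backbone(images.tensor)

        # The proposal generator (RPN) returns proposals and its own loss dict.
        proposals, proposal_losses = self.proposal_generator(
            images, features, gt_instances
        )

        # The ROI-head forward/losses are intentionally omitted here,
        # so only the RPN losses drive the base-class training.
        return proposal_losses
```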
RegionCLIP relies on a subjectively predefined vocabulary and assumes that the objects of interest are known in advance (i.e., what we know). However, in real-world scenarios such as safe autonomous driving, there are often unexpected obstacles (i.e., what we don't know) that fall outside the predefined vocabulary. These objectively present but subjectively unanticipated objects can lead to false or missed detections. Therefore, it is crucial to train detectors that can recognize known objects while also rejecting unknown ones.