Open Jianqiuer opened 8 months ago
@Jianqiuer
We follow SAM's conclusion that box prediction is not ambiguous. As a result, we always select the first mask token for box prompts.
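Score subtraction is a simple vectorized implementation of both SAM's routing strategy and ours: shifting the first mask token's score by a large constant lets a single argmax make the selection without any branching. We followed SAM's ONNX wrapper code here. [Code].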
Our routing strategy differs slightly from SAM's implementation because we reconsidered the ambiguity issue for K-point prompts. Estimating an accurate "K" is typically non-trivial in both the training and evaluation phases. For simplicity, we always select the top-ranked mask token for point prompts.
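To make the score-subtraction idea concrete, here is a minimal NumPy sketch of how such routing could look. It is an illustration under assumptions, not the project's actual code: the function name `route_masks`, the `is_box` flag, and the choice to add the offset for boxes and subtract it for points are all hypothetical, and only the constant 1000 comes from the discussion above.

```python
import numpy as np

def route_masks(iou_scores, is_box, offset=1000.0):
    """Pick one mask token per prompt with a single argmax (hypothetical helper).

    iou_scores: (N, M) predicted IoU score for each of M mask tokens.
    is_box:     (N,) True for box prompts, False for point prompts.
    """
    scores = np.asarray(iou_scores, dtype=np.float64).copy()
    is_box = np.asarray(is_box, dtype=bool)
    # Assumption: boxes are unambiguous, so push token 0 far above the rest;
    # points are ambiguous, so push token 0 far below the rest and let the
    # highest-scoring remaining token win.
    scores[:, 0] += np.where(is_box, offset, -offset)
    return scores.argmax(axis=1)

iou = [[0.9, 0.5, 0.7, 0.6],   # box prompt
       [0.9, 0.5, 0.7, 0.6]]   # point prompt
selected = route_masks(iou, is_box=[True, False])
print(selected)
```

Because predicted IoU scores lie roughly in [0, 1], an offset of 1000 always dominates them, so the shift acts as a hard switch rather than a soft reweighting. This keeps the selection free of control flow, which is convenient for batched or exported models.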
I've been exploring the implementation of the heuristic routing strategy in your project and came across an operation that piqued my curiosity. The strategy doesn't directly use the first bounding box prediction result (IoU score index 0) for routing decisions. Instead, the initial mask prediction result appears to be modified by subtracting 1000 from it.
Could you please clarify the underlying principle behind this approach? I'm particularly interested in understanding:
The rationale for not using the first BBox prediction result as-is for heuristic routing.
The significance and expected impact of subtracting 1000 from the initial mask prediction result.
I believe understanding this could greatly enhance my comprehension of the heuristic routing strategy's design and its implications on the system's overall performance.
Looking forward to your insights.
Thank you for your time and consideration.