Open Lijiaxin0111 opened 3 months ago
Dear Jiaxin,
Thanks for your interest in our work! Indeed, a bounding-box-based approach is not ideal for objects that split into parts. We still went with it in this case, since such objects constitute only a fraction of all instances in VOST. The average optical flow inside the object mask would perhaps have been a better approximation.
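For illustration, the optical-flow alternative mentioned above could be sketched roughly as follows. This is not code from the VOST authors; the function name `mean_flow_in_mask` is hypothetical, and the flow field is assumed to be an `(H, W, 2)` array of per-pixel `(dx, dy)` displacements produced by any optical flow estimator.

```python
import numpy as np

def mean_flow_in_mask(flow, mask):
    """Average the (dx, dy) flow vectors over the foreground pixels of a mask.

    flow: float array of shape (H, W, 2), assumed to come from any flow estimator.
    mask: array of shape (H, W); nonzero pixels belong to the object,
          including all of its parts if it has been split.
    Returns a length-2 array (mean dx, mean dy), or None for an empty mask.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return None
    # Boolean indexing selects an (N, 2) array of flow vectors inside the mask.
    return flow[mask].mean(axis=0)
```

Because the average runs over mask pixels rather than a box, disconnected parts contribute in proportion to their size, and the empty space between them does not affect the result.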
Great work, Congratulations!
But I noticed that the paper says: "To this end, we follow [16] and compute the distance between the centers of bounding boxes enclosing the objects mask in frames t and t − 1 in the horizontal dimension as $d^t_x = \frac{\|x_{t-1} - x_t\|}{a_{t-1}}$, where $a_{t-1}$ is the bounding box area in frame t − 1."
I am curious: when the object is separated into many parts (for example, cut or broken off), how do you compute the box? Do you use a single box that completely covers all parts?
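To make the question concrete, here is a minimal sketch of the quoted measure, assuming a single box is drawn around the union of all parts of the mask. The thread suggests this reading but does not spell it out. The function names `bbox_of_mask` and `horizontal_motion` are hypothetical and do not come from the VOST code.

```python
import numpy as np

def bbox_of_mask(mask):
    """Tightest box (x_min, y_min, x_max, y_max) covering every foreground pixel.

    If the object is split into several parts, the box spans all of them.
    Returns None for an empty mask.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

def horizontal_motion(mask_prev, mask_curr):
    """Horizontal box-center distance between frames t-1 and t,
    divided by the box area in frame t-1 (as in the quoted formula)."""
    box_prev = bbox_of_mask(mask_prev)
    box_curr = bbox_of_mask(mask_curr)
    if box_prev is None or box_curr is None:
        return None
    cx_prev = (box_prev[0] + box_prev[2]) / 2.0
    cx_curr = (box_curr[0] + box_curr[2]) / 2.0
    # Area in pixels, counting both boundary rows and columns.
    area_prev = (box_prev[2] - box_prev[0] + 1) * (box_prev[3] - box_prev[1] + 1)
    return abs(cx_prev - cx_curr) / area_prev
```

With this reading, a part that drifts far from the rest of the object enlarges the box and shifts its center, which is why the measure is only a rough approximation for objects that split.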