TRI-ML / VOST

Code for the VOST dataset

Question About the Fast Motion #7

Open Lijiaxin0111 opened 3 months ago

Lijiaxin0111 commented 3 months ago

Great work, congratulations!

But I noticed that the paper says: "To this end, we follow [16] and compute the distance between the centers of bounding boxes enclosing the objects mask in frames t and t − 1 in the horizontal dimension as $d^t_x = \frac{\|x_{t-1} - x_t\|}{a_{t-1}}$, where $a_{t-1}$ is the bounding box area in frame t − 1."

I am curious: when the object is separated into many parts (e.g., cut or broken off), how do you compute the box? Did you use a box that completely covers all the parts?

pvtokmakov commented 2 months ago

Dear Jiaxin,

Thanks for your interest in our work! Indeed, a bounding-box-based approach is not ideal for objects that split into parts. In this case we still went with it, since such objects constitute only a fraction of all instances in VOST. Average optical flow inside the object mask would perhaps have been a better approximation.
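
For illustration, here is a minimal sketch of the two measures discussed in this thread. It is not taken from the VOST code. The function names (`mask_bbox`, `bbox_motion_x`, `mean_flow_in_mask`) are hypothetical, and it assumes that the bounding box is the tightest box covering every foreground pixel of the mask, so a split object gets one box around all of its parts. The normalization follows the formula as quoted above.

```python
import numpy as np


def mask_bbox(mask):
    """Tightest box (x_min, y_min, x_max, y_max) covering all foreground pixels.

    Assumption: disconnected parts of a split object share one enclosing box.
    Returns None for an empty mask.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def bbox_motion_x(mask_prev, mask_curr):
    """Horizontal box-center displacement divided by the previous box area."""
    box_prev = mask_bbox(mask_prev)
    box_curr = mask_bbox(mask_curr)
    if box_prev is None or box_curr is None:
        return None  # object not visible in one of the frames
    cx_prev = (box_prev[0] + box_prev[2]) / 2.0
    cx_curr = (box_curr[0] + box_curr[2]) / 2.0
    # Box area in pixels, counting both boundary rows and columns.
    area_prev = (box_prev[2] - box_prev[0] + 1) * (box_prev[3] - box_prev[1] + 1)
    return abs(cx_prev - cx_curr) / area_prev


def mean_flow_in_mask(flow, mask):
    """Mean optical-flow magnitude over mask pixels.

    flow: (H, W, 2) array of per-pixel displacements, from any flow estimator.
    mask: (H, W) array, nonzero where the object is.
    """
    inside = np.asarray(mask).astype(bool)
    if not inside.any():
        return None
    return float(np.linalg.norm(flow[inside], axis=1).mean())


# Frame t-1: one compact 2x2 object.
prev = np.zeros((10, 10), dtype=np.uint8)
prev[2:4, 1:3] = 1

# Frame t: the object has split into two parts that drifted apart.
curr = np.zeros((10, 10), dtype=np.uint8)
curr[2:4, 3:5] = 1
curr[6:8, 7:9] = 1

print(mask_bbox(curr))            # one box around both parts
print(bbox_motion_x(prev, curr))  # box center jumps because the box grew
```

In the toy example the enclosing box grows when the object splits, so its center shifts even where the individual parts barely move. That is the weakness of the box-based measure noted in the reply, and the mask-averaged flow does not depend on the box at all.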