Open hahapt opened 2 years ago
I have the same concern. At the bottom of Figure 5, how does MinVIS associate the dog between t1 and t3? In my understanding, the dog query Q1 in t1 is associated with the empty query Q12 in t2, and the dog query Q3 in t3 is associated with the empty query Q32 in t2. In this figure, MinVIS successfully associates Q1 and Q3. Does this suggest that Q12 == Q32? I don't understand how it works.
In the real world, objects may sometimes be completely occluded for a few frames. For example, when a person is blocked by a large truck, can MinVIS match the person before and after the occlusion?
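To make the question concrete, here is a minimal sketch of how I picture the frame-by-frame association. It assumes the tracking is a one-to-one matching of query embeddings between adjacent frames by cosine similarity. This is my assumption, not code from the MinVIS repository. The names `cosine`, `match_frames` and `track` and the toy 3-d embeddings are made up for illustration, and the brute-force search stands in for a proper Hungarian solver.

```python
import itertools
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_frames(prev, curr):
    """Return perm where perm[i] is the query in `curr` matched to prev[i].

    Brute-force one-to-one assignment that maximizes total cosine
    similarity. Only practical for a handful of queries.
    """
    n = len(prev)
    best = max(
        itertools.permutations(range(n)),
        key=lambda p: sum(cosine(prev[i], curr[p[i]]) for i in range(n)),
    )
    return list(best)


def track(frames):
    """Chain adjacent-frame matches into tracks.

    tracks[k][t] is the query index that track k occupies in frame t.
    """
    tracks = [[i] for i in range(len(frames[0]))]
    for t in range(1, len(frames)):
        # Embeddings of each track's query in the previous frame.
        prev = [frames[t - 1][tr[-1]] for tr in tracks]
        perm = match_frames(prev, frames[t])
        for k, tr in enumerate(tracks):
            tr.append(perm[k])
    return tracks


# Toy data (hypothetical): two queries per frame, dog occluded in t2.
frames = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # t1: query 0 = dog, query 1 = empty
    [[0.2, 0.1, 1.0], [0.0, 1.0, 0.2]],  # t2: no dog, both queries "empty"
    [[0.0, 1.0, 0.1], [1.0, 0.0, 0.1]],  # t3: dog is now query 1
]
dog_track = track(frames)[0]
```

In this sketch the dog in t1 and the dog in t3 end up on the same track only when both of them are matched to the same query slot in t2, which is exactly the Q12 == Q32 condition I am asking about. If the occlusion lasts several frames, nothing in this chaining guarantees that the slot stays the same.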