Closed z-jiaming closed 1 year ago
I checked the sequence.
There are 2 frames in this sequence after excluding the masks that are not in the first frame, as you mentioned. The first frame (P02_09_frame_0000042133.png) contains 'right hand' and 'salt bottle', and the second frame (P02_09_frame_0000042193.png) contains 'right hand'. If you try to evaluate them using the standard DAVIS evaluation code, you will get an error, since it excludes the first and last frames. In the evaluation code in this repository, we include the last frame in the evaluation, as we have sparse frames and are interested in evaluating all available frames, i.e. in this sequence we will evaluate both masks in frame P02_09_frame_0000042193.png and consider them as the score of the sequence.
Please consider using the evaluation code in this repository. We also use it in our official Codalab evaluation.
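To make the difference concrete, here is a minimal sketch of the two frame-selection rules. The function name frames_to_evaluate and its include_last parameter are hypothetical and only illustrate the slicing; they are not taken from the repository's evaluation code.

```python
def frames_to_evaluate(frame_names, include_last=True):
    """Return the frames that get scored.

    The first frame is always skipped because it supplies the reference masks.
    include_last=False mimics the standard DAVIS behaviour (drop the last frame too).
    """
    if include_last:
        return frame_names[1:]
    return frame_names[1:-1]


# The two frames of the sequence discussed in this thread.
seq = ["P02_09_frame_0000042133.png", "P02_09_frame_0000042193.png"]

davis_style = frames_to_evaluate(seq, include_last=False)
visor_style = frames_to_evaluate(seq, include_last=True)

print(davis_style)  # [] -> nothing left to score, which causes the error
print(visor_style)  # ['P02_09_frame_0000042193.png']
```

With only two frames, dropping both ends leaves an empty list, while keeping the last frame leaves exactly one frame to score.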
Thanks for your reply! I am using the evaluation code in this repository, but it also excludes the first and last frames: https://github.com/epic-kitchens/VISOR-VOS/blob/b6afe07691fad922ebacb42654046ef24a0fe3ba/evaldavis2017/davis2017/evaluation.py#L85
Am I using it the wrong way?
Thanks for the update! You're correct, the evaluation file might have been out of sync (I can still confirm that our official CodaLab evaluation uses the last frame). I updated the file and will clone the repo and double-check later today, but I assume the issue is now resolved by updating the evaluation script to include the last frame: https://github.com/epic-kitchens/VISOR-VOS/blob/7832ad59735725d5647da7398a427301587ae67b/evaldavis2017/davis2017/evaluation.py#L85
Please let me know if you have any further comments or concerns.
I re-cloned evaldavis2017 and found the following two minor issues: 1) 'years' should be 2022 https://github.com/epic-kitchens/VISOR-VOS/blob/7832ad59735725d5647da7398a427301587ae67b/evaldavis2017/davis2017/davis.py#L34
2) delete this https://github.com/epic-kitchens/VISOR-VOS/blob/7832ad59735725d5647da7398a427301587ae67b/evaldavis2017/davis2017/evaluation.py#L87
Just minor issues, but it would be nice if it could be changed.
I used visor_to_davis.py to convert VISOR to DAVIS format.
But when I evaluated it, I found that 'VISOR_2022/Annotations/480p/P02_09_seq_00046' has only two frames (P02_09_frame_0000042133.png and P02_09_frame_0000042193.png) after conversion. Because the eval tools remove the first and last frames, this sequence ends up as None.
I debugged visor_to_davis.py and found the cause: to generate the val set, the setting should be -keep_first_frame_masks_only 1. The first frame of P02_09_seq_00046 is 'P02_09_frame_0000042133.png', which contains the 'right hand' and 'salt bottle' targets. But the following frames do not have these targets, so their masks are None. What should I do to generate the val set? Thanks a lot!
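For readers following along, the effect of keeping only first-frame masks can be sketched roughly as below. This is not the code from visor_to_davis.py: the function keep_first_frame_objects is hypothetical, the third frame and its 'knife' object are made up for illustration, and dropping frames that end up with no objects is an assumption about the conversion.

```python
def keep_first_frame_objects(sequence):
    """Keep only objects that appear in the first frame of the sequence.

    sequence: list of (frame_name, set_of_object_names) in temporal order.
    Frames left with no objects are dropped (an assumption, see the text above).
    """
    if not sequence:
        return []
    first_objects = sequence[0][1]
    filtered = [(name, objs & first_objects) for name, objs in sequence]
    return [(name, objs) for name, objs in filtered if objs]


sequence = [
    ("P02_09_frame_0000042133.png", {"right hand", "salt bottle"}),
    ("P02_09_frame_0000042193.png", {"right hand"}),
    # Hypothetical extra frame with an object that is not in the first frame.
    ("hypothetical_later_frame.png", {"knife"}),
]

kept = keep_first_frame_objects(sequence)
for name, objs in kept:
    print(name, sorted(objs))
```

In this sketch only the two real frames survive, which is why the sequence ends up too short for an evaluation that also drops the last frame.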