ant-research / VCSL

Video Copy Segment Localization (VCSL) dataset and benchmark [CVPR2022]
MIT License

Some labels are duplicate #13

Open xuan97916 opened 1 year ago

xuan97916 commented 1 year ago

Thanks for your great work on the VCSL dataset! When checking the test pairs in split_meta_pairs.json, we found that about 1% of the labels are either self-duplicates (e.g. e45b9eec5ea54e00a2d6e6689cd5fe92-e45b9eec5ea54e00a2d6e6689cd5fe92, dd44c4be9fdc4c95bdf075e7e756294e-dd44c4be9fdc4c95bdf075e7e756294e) or query-reference reverse duplicates (e.g. dd44c4be9fdc4c95bdf075e7e756294e-0a25c04af29940e5b34676b6c1a7eca1 (label: [[65, 1, 79, 156]]) and 0a25c04af29940e5b34676b6c1a7eca1-dd44c4be9fdc4c95bdf075e7e756294e (label: [[1, 65, 156, 79]])). In addition, the labels of some reverse-duplicate pairs differ from each other (e.g. dd44c4be9fdc4c95bdf075e7e756294e-ca334995c55c45af93dee776799f0433 (label: [[52, 9, 82, 44], [68, 84, 82, 96], [9, 1, 16, 7], [46, 97, 49, 100], [47, 45, 49, 47]]) and ca334995c55c45af93dee776799f0433-dd44c4be9fdc4c95bdf075e7e756294e (label: [[9, 52, 44, 82], [1, 9, 7, 16], [97, 46, 100, 49]])). These duplicate pairs may cause inaccurate results. Were these duplicate pairs included by mistake or by design? Thank you!
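For anyone who wants to reproduce the check, here is a minimal sketch of how the two kinds of duplicates could be found. It assumes only what the examples above show: pair IDs are strings of the form `query-reference`, and each label box becomes its reverse-pair counterpart by swapping positions 0/1 and 2/3. The function names and the in-memory `labels` dictionary are hypothetical, and loading split_meta_pairs.json is left out because its exact layout is not described here.

```python
def find_duplicate_pairs(pair_ids):
    """Return (self_pairs, reverse_pairs) for 'query-reference' ID strings."""
    seen = set(pair_ids)
    self_pairs, reverse_pairs = [], []
    for pair_id in pair_ids:
        query, reference = pair_id.split("-")
        if query == reference:
            self_pairs.append(pair_id)
            continue
        mirrored = f"{reference}-{query}"
        # Report each unordered pair once, from its lexicographically smaller side.
        if query < reference and mirrored in seen:
            reverse_pairs.append((pair_id, mirrored))
    return self_pairs, reverse_pairs


def swap_box(box):
    """Assumed mapping between orientations: [a, b, c, d] -> [b, a, d, c]."""
    a, b, c, d = box
    return [b, a, d, c]


def labels_equivalent(label, reverse_label):
    """True if the two labels contain the same boxes once one side is swapped."""
    return sorted(swap_box(box) for box in label) == sorted(reverse_label)


# Labels copied from the examples in this comment (IDs shortened for readability).
labels = {
    "dd44-0a25": [[65, 1, 79, 156]],
    "0a25-dd44": [[1, 65, 156, 79]],
    "dd44-ca33": [[52, 9, 82, 44], [68, 84, 82, 96], [9, 1, 16, 7],
                  [46, 97, 49, 100], [47, 45, 49, 47]],
    "ca33-dd44": [[9, 52, 44, 82], [1, 9, 7, 16], [97, 46, 100, 49]],
    "e45b-e45b": [],  # self-duplicate; its label is not quoted in the thread
}

self_pairs, reverse_pairs = find_duplicate_pairs(list(labels))
inconsistent = [
    (a, b) for a, b in reverse_pairs
    if not labels_equivalent(labels[b], labels[a])
]
```

With the labels quoted above, the first reverse pair is equivalent after swapping, while the second is flagged as inconsistent because one side has five boxes and the other has three.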

vcsl-owner commented 1 year ago

Thank you for the feedback! We consider the "self-duplicate" pairs special but valid cases: models are expected to treat a video as "copying" itself.

Our annotation process has several stages that involve interaction between human annotators and algorithms, and occasionally "query-reference reverse duplicates" were included (about 1% of the total pairs). Most of these pairs have equivalent labels, as in your second example, but when the two duplicates were processed by different annotators, the labels can be inconsistent (about 0.3% of the total pairs). We removed the "query-reference reverse duplicates" and re-tested the TransVCL model; the F1-score is 66.41, which differs only slightly from the number in the paper (66.51).

To clear up concerns, we will remove those duplicates and update the benchmark. Thanks again!
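Until the benchmark is updated, the removal described in this reply could be approximated locally along the following lines. This is only a sketch under stated assumptions, not the maintainers' procedure: it assumes pairs are held in a dictionary keyed by `query-reference` ID strings, and it keeps whichever orientation appears first, since the reply does not say which side of a reverse duplicate is retained. Self-duplicates are kept, as the reply treats them as valid. The function name `drop_reverse_duplicates` is hypothetical.

```python
def drop_reverse_duplicates(pairs):
    """Keep one orientation per unordered video pair (first seen wins).

    pairs: dict mapping 'query-reference' ID strings to their label lists.
    Self-duplicate pairs (query == reference) are kept unchanged.
    """
    kept = {}
    seen_unordered = set()
    for pair_id, label in pairs.items():
        query, reference = pair_id.split("-")
        key = frozenset((query, reference))  # orientation-independent key
        if key in seen_unordered:
            continue  # the mirrored pair was already kept
        seen_unordered.add(key)
        kept[pair_id] = label
    return kept


# Tiny illustration with shortened, made-up IDs.
example = {
    "aaaa-bbbb": [[65, 1, 79, 156]],
    "bbbb-aaaa": [[1, 65, 156, 79]],
    "cccc-cccc": [[0, 0, 10, 10]],
}
deduplicated = drop_reverse_duplicates(example)
```

Because the inconsistent pairs carry different labels on each side, the choice of which orientation to keep can change the ground truth for roughly 0.3% of pairs, so results computed this way may not match the updated benchmark exactly.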