Hi @mohamedmossadaeye, in the actual annotation process, a video is labeled by many annotators (users) to avoid bias, because different people have their own preferences for "important" frames. Each annotator produces an N-dim binary vector, so U annotators together generate a UxN binary matrix. The scores for each frame are then averaged across all annotators, giving the ground-truth score for each frame.
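For illustration, here is a minimal NumPy sketch of that averaging step (not the repo's actual code; the annotator vectors below are made up):

```python
import numpy as np

# U annotators, N frames; each row is one annotator's binary
# "important frame" vector (hypothetical data).
user_summary = np.array([
    [0, 0, 1, 1, 1, 1, 0, 0, 1, 1],  # annotator 1
    [0, 0, 0, 1, 1, 1, 0, 0, 1, 1],  # annotator 2
    [0, 0, 1, 1, 1, 0, 0, 0, 1, 1],  # annotator 3
])  # shape (U, N)

# Ground-truth importance score per frame = mean over annotators.
gt_score = user_summary.mean(axis=0)  # shape (N,)
print(gt_score)  # [0. 0. 0.667 1. 1. 0.667 0. 0. 1. 1.]
```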
Great! What tools were used to annotate the videos?
Sorry, but I don't have a recommendation for that. We didn't make a dataset ourselves; we only used the public ones. I did a quick search and found this repo: https://github.com/video-annotator/pythonvideoannotator. Maybe it could help you?
I imagined that the JSON in the `custom_data` folder would model the total frames as a binary list. For example, assume we have 10 frames in the video and the most important segments are frames [3 >> 6] and [9 >> 10]; then the annotation would be [0,0,1,1,1,1,0,0,1,1]. In other words, I'm confused about this statement in the readme.md file: "The user summary of a video is a UxN binary matrix, where U denotes the number of annotators and N denotes the number of frames in the original video."
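To make that example concrete, here is a small sketch (`segments_to_binary` is a hypothetical helper, not part of the repo) that turns segment ranges into one annotator's binary vector:

```python
def segments_to_binary(n_frames, segments):
    """Convert 1-based, inclusive (start, end) segments to a binary vector."""
    vec = [0] * n_frames
    for start, end in segments:
        for i in range(start - 1, end):  # shift 1-based indices to 0-based
            vec[i] = 1
    return vec

print(segments_to_binary(10, [(3, 6), (9, 10)]))
# -> [0, 0, 1, 1, 1, 1, 0, 0, 1, 1]
```

Each annotator would contribute one such vector, and stacking U of them row-wise yields the UxN matrix described in the readme.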
Why replicate the frames U times, and what is U?