Hi lin-nie,
The annotations from the official channel are "action recognition" temporal segments, i.e. the segments in which we want to recognise the human action. The annotations in the Dropbox zip (*_feature_times.pkl) are start/end timestamps of sliding windows (stride 0.2 sec).
To run TIM, you need to extract the features using the *_feature_times.pkl files, since TIM needs dense features over the untrimmed videos. The annotations from the official channel are then used for training TIM.
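A minimal sketch of the difference between the two annotation types (the 1-second window length is an assumption for illustration, since only the 0.2 s stride is stated above; the CSV columns follow the official epic-kitchens-100-annotations layout):

```python
import pandas as pd

# 1) Dense sliding windows, as described by the Dropbox *_feature_times.pkl
#    files: they tile the whole untrimmed video with a 0.2 s stride,
#    regardless of where actions occur. The 1.0 s window length here is an
#    assumption for illustration.
def dense_windows(video_len_sec, win_sec=1.0, stride_sec=0.2):
    n = int(round((video_len_sec - win_sec) / stride_sec)) + 1
    return [(round(i * stride_sec, 3), round(i * stride_sec + win_sec, 3))
            for i in range(n)]

# 11 windows for a 3 s clip: (0.0, 1.0), (0.2, 1.2), ..., (2.0, 3.0)
print(dense_windows(3.0))

# 2) Action segments from the official epic-kitchens-100-annotations repo:
#    only the labelled intervals, used to train TIM itself.
segments = pd.read_csv("EPIC_100_train.csv")
print(segments[["video_id", "start_timestamp", "stop_timestamp",
                "verb", "noun"]].head())
```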
Jaesung
Hi, thank you very much for your prompt reply. Let me summarize your answer:
For TIM-Omnivore backbone:
For TIM-VideoMAE backbone:
For TIM-recognition:
For TIM-detection:
Do I understand correctly? Jaesung, thank you very much!
Nie 2024.08.22
Yes.
One correction on the second point: for _context_pickle, you need to put the path of the *_feature_times.pkl files.
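For anyone following along, a hedged illustration of that correction (the exact config key in the TIM code may be spelled differently; the path below is a placeholder and the pickle layout is an assumption):

```python
import pandas as pd

# _context_pickle should point at the Dropbox *_feature_times.pkl file,
# NOT at the official action-segment annotations.
context_pickle = "annotations/EPIC_100_train_feature_times.pkl"  # placeholder path

feature_times = pd.read_pickle(context_pickle)
print(type(feature_times))  # inspect what the pickle actually holds
print(feature_times)        # expect per-window start/end times, 0.2 s stride
```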
Thanks for your prompt reply; I got your point.
Thank you very much!
Jaesung, hope you have a nice day.
Nie 2024.08.22
Hello, thank you very much for providing such an outstanding and impressive model. I am currently attempting to reproduce your results.
I would like to understand why there are two different sources for obtaining the annotations: 1) one provided through the official channel: https://github.com/epic-kitchens/epic-kitchens-100-annotations, and 2) another provided in your TIM GitHub project: https://www.dropbox.com/scl/fi/xs6muwf67a5h9ql30jart/annotations.zip?rlkey=iw6b4w9n4brcpvygoksmrvf4n&e=1&st=j6c1exut&dl=0.
Could you please explain the differences between the annotations obtained from these two sources?
Also, when I extracted features using VideoMAE and Omnivore, I used the first source, i.e. the official annotations. I only discovered later that you provide a second set of annotations. Could you please advise how this affects the extracted features? Should I re-extract the features using the second set of annotations?
Thank you very much for your help; looking forward to your message!
Nie