How to conduct dense and sparse sampling?

doc-doc / HQGA

Video as Conditional Graph Hierarchy for Multi-Granular Question Answering (AAAI'22, Oral)

MIT License

Fly2flies closed this issue 2 years ago

Fly2flies commented 2 years ago

Hi, thank you for sharing such great work. I would like to know how dense sampling and sparse sampling are performed after uniformly sampling K clip frames.

After sampling K clip frames c1,...,cK, do you:

1. take each of these K frames as the middle frame and sample 16 consecutive frames forward and backward to compute the motion features, or
2. uniformly sample 32 frames from all frames between c(i-1) and c(i+1) to compute the motion features?

Which of these methods corresponds to the paper?

doc-doc commented 2 years ago

Thanks for your interest. We adopt the 1st method and sample 16 frames (8 forward / backward) centered at the key frame.
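
For readers who want to reproduce this, the index arithmetic might look roughly like the sketch below. It is not taken from the HQGA code: the function names are hypothetical, the rule for picking the K key frames (the centre of K equal segments), the exact split of the window (8 frames before the key frame, 8 from the key frame onward), and the clamping at video boundaries are all assumptions.

```python
from typing import List


def uniform_key_frames(num_frames: int, k: int) -> List[int]:
    """Pick k key-frame indices spread uniformly over the video.

    Assumption: each key frame is the centre of one of k equal segments.
    """
    return [int((i + 0.5) * num_frames / k) for i in range(k)]


def dense_clip(center: int, num_frames: int, clip_len: int = 16) -> List[int]:
    """Return clip_len consecutive frame indices centred at a key frame.

    Assumption: the window covers clip_len // 2 frames before the key
    frame and clip_len // 2 frames starting at the key frame itself.
    Indices falling outside the video are clamped to the first or last
    frame, so edge frames are repeated (also an assumption).
    """
    half = clip_len // 2
    window = range(center - half, center + half)
    return [min(max(i, 0), num_frames - 1) for i in window]


if __name__ == "__main__":
    keys = uniform_key_frames(num_frames=80, k=8)
    clips = [dense_clip(c, num_frames=80) for c in keys]
    print(keys)
    print(clips[0])
```

Each 16-frame clip would then go to the motion feature extractor, while the key frames alone serve as the sparse samples.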

Fly2flies commented 2 years ago

Thank you for your reply. I would also like to know how much storage is needed for all the TGIF-QA data, and how to compress and store the extracted features.

doc-doc commented 2 years ago

The raw TGIF_full dataset needs about 124 GB. The features are stored in .h5 files and need about 40 GB for each sub-task.
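
As a rough illustration of storing features in an .h5 file, here is a minimal sketch using h5py and NumPy. The dataset names (`ids`, `feat`), the array layout (videos × clips × feature dimension), and the gzip setting are assumptions for illustration only and may not match the files used in this repository. Dense float features often compress only modestly, so the saving from gzip may be small.

```python
import h5py
import numpy as np


def save_features(path, video_ids, feats):
    """Write per-video features to an HDF5 file with gzip compression.

    Assumed layout: feats has shape (num_videos, num_clips, dim).
    The dataset names "ids" and "feat" are hypothetical.
    """
    feats = np.asarray(feats, dtype=np.float32)
    with h5py.File(path, "w") as f:
        f.create_dataset("ids", data=np.asarray(video_ids, dtype=np.int64))
        f.create_dataset(
            "feat",
            data=feats,
            compression="gzip",
            compression_opts=4,
            # One chunk per video, so a single video can be read
            # without decompressing the rest of the file.
            chunks=(1,) + feats.shape[1:],
        )


def load_video_features(path, row):
    """Read the features of one video (by row index) from the file."""
    with h5py.File(path, "r") as f:
        return f["feat"][row]
```

Chunking by video keeps random access cheap during training, since a data loader typically reads one video at a time.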