TengdaHan / CoCLR

[NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.
Apache License 2.0
286 stars, 32 forks

Making Kinetics-400 RGB lmdb dataset. #36

Closed. bofang98 closed this issue 3 years ago

bofang98 commented 3 years ago

Hi Tengda, thanks for your excellent work and the detailed instructions for the open-source code. I have a question about making the lmdb dataset for Kinetics-400. I noticed that your code builds an lmdb dataset for K400 as well, not only for UCF101 and HMDB51. How much disk space do I need to extract all K400 RGB frames in JPEG format as you did? And how large is the K400_rgb_lmdb dataset once the frames are extracted and the lmdb is built? Looking forward to your reply. Thank you!

TengdaHan commented 3 years ago

As far as I know, lmdb does not compress data. On my side, the lmdb file is 1TB, and the extracted frames are also about 1TB. Recently I found that extracting frames on-the-fly from video is also fast; see the ffmpeg operation in https://github.com/antoine77340/MIL-NCE_HowTo100M/blob/master/video_loader.py for an example.
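
For readers who want to try the on-the-fly route instead of storing about 1TB of frames, here is a minimal sketch of the idea. It is not the code from the linked loader or from CoCLR: it assumes the `ffmpeg` command-line tool is installed and calls it through `subprocess`. The function names (`build_ffmpeg_cmd`, `split_frames`, `load_clip`) and the default clip length, frame rate, and resolution are hypothetical choices for illustration.

```python
import subprocess


def build_ffmpeg_cmd(video_path, start_sec, duration_sec, fps, size):
    """Build an ffmpeg CLI command that writes raw RGB frames to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-ss", str(start_sec),     # seek to the clip start (input option)
        "-t", str(duration_sec),   # read only this many seconds of input
        "-i", video_path,
        # Resample to a fixed frame rate and resize to size x size.
        # Note: this ignores the aspect ratio; add a crop if that matters.
        "-vf", f"fps={fps},scale={size}:{size}",
        "-f", "rawvideo", "-pix_fmt", "rgb24",
        "pipe:1",                  # write the decoded frames to stdout
    ]


def split_frames(raw, height, width):
    """Split a raw rgb24 byte buffer into one bytes object per frame."""
    frame_bytes = height * width * 3  # 3 bytes per pixel (R, G, B)
    num_frames = len(raw) // frame_bytes
    return [raw[i * frame_bytes:(i + 1) * frame_bytes]
            for i in range(num_frames)]


def load_clip(video_path, start_sec=0.0, duration_sec=2.0, fps=10, size=224):
    """Decode a short clip straight from the video file, no frames on disk."""
    cmd = build_ffmpeg_cmd(video_path, start_sec, duration_sec, fps, size)
    raw = subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout
    return split_frames(raw, size, size)
```

Calling `load_clip("some_video.mp4")` inside a dataset's `__getitem__` would return a list of raw RGB frames, each `size * size * 3` bytes, which can then be converted to an array or tensor. The trade-off is extra CPU decoding per sample in exchange for not keeping the extracted frames or the lmdb file on disk.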

TengdaHan commented 3 years ago

Re-open if you have more questions.