mx-mark / VideoTransformer-pytorch

PyTorch implementation of a collection of scalable Video Transformer Benchmarks.

How to dataloader? #25

Open SuperGentry opened 2 years ago

SuperGentry commented 2 years ago

Hello, thank you very much for your outstanding work. I am new to computer vision, and I could not see how the images are loaded into the model. Could you tell me how to extract 16 frames from a video and feed them into the ViViT model? I look forward to your reply.

mx-mark commented 1 year ago

@SuperGentry We use decord to extract the video frames. For details on how to read frames from a video, see its official repository: https://github.com/dmlc/decord. Once the frames are loaded, PyTorch's DataLoader organizes the data into batches for training; see the PyTorch documentation: https://pytorch.org/docs/stable/data.html?highlight=dataloader#.
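As a rough sketch of those two steps (not necessarily how this repository's own data code does it): pick 16 evenly spaced frame indices, read them with decord, and wrap the result in a map-style dataset that a PyTorch DataLoader can batch. The names `sample_frame_indices`, `load_clip`, and `VideoClips`, the 224x224 resize, and the uniform sampling strategy are all assumptions made for illustration.

```python
def sample_frame_indices(total_frames, num_frames=16):
    """Return num_frames indices spread evenly over [0, total_frames - 1]."""
    if total_frames < 1 or num_frames < 1:
        raise ValueError("total_frames and num_frames must be positive")
    if num_frames == 1:
        return [0]
    step = (total_frames - 1) / (num_frames - 1)
    # Clamp so floating-point rounding can never step past the last frame.
    return [min(round(i * step), total_frames - 1) for i in range(num_frames)]


def load_clip(video_path, num_frames=16, size=224):
    """Decode num_frames frames from one video as a uint8 array (T, H, W, C)."""
    # decord is imported lazily so the index helper works without it installed.
    from decord import VideoReader, cpu

    # width/height make decord resize while decoding, so every clip has the
    # same shape and can be stacked into a batch (224 is an assumed size).
    vr = VideoReader(video_path, ctx=cpu(0), width=size, height=size)
    indices = sample_frame_indices(len(vr), num_frames)
    return vr.get_batch(indices).asnumpy()


class VideoClips:
    """Hypothetical map-style dataset: one (clip, label) pair per video file."""

    def __init__(self, video_paths, labels, num_frames=16):
        self.video_paths = list(video_paths)
        self.labels = list(labels)
        self.num_frames = num_frames

    def __len__(self):
        return len(self.video_paths)

    def __getitem__(self, i):
        clip = load_clip(self.video_paths[i], self.num_frames)
        return clip, self.labels[i]
```

An object like this can be passed to `torch.utils.data.DataLoader(VideoClips(paths, labels), batch_size=4, shuffle=True)`; the default collate function converts the NumPy clips to tensors of shape (batch, T, H, W, C). The frames would still need to be cast to float, normalized, and permuted into whatever layout the model expects, which this sketch does not attempt.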

SuperGentry commented 1 year ago

thank you very much!