Open 1363555074 opened 2 months ago
The npy files contain processed image data (one frame per video). For example, print(npy_file.shape) outputs (N, 256, 256, 3), where N is the number of data samples, 256x256 is the width and height of each image, and 3 is the number of channels.
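To make the layout concrete, here is a minimal sketch of how such a file could be built and read back with NumPy. The helper name `pack_frames`, the `uint8` dtype, and the all-zero placeholder frames are assumptions for illustration, not the preprocessing actually used in this repository.

```python
import os
import tempfile

import numpy as np


def pack_frames(frames, path):
    """Stack frames of shape (256, 256, 3) into one (N, 256, 256, 3) array and save it."""
    # uint8 is an assumption; the real files may use a different dtype.
    data = np.stack(frames, axis=0).astype(np.uint8)
    np.save(path, data)
    return data.shape


# Placeholder frames standing in for one preprocessed frame per video.
frames = [np.zeros((256, 256, 3), dtype=np.uint8) for _ in range(4)]
path = os.path.join(tempfile.mkdtemp(), "example.npy")
pack_frames(frames, path)

npy_file = np.load(path)
print(npy_file.shape)  # (4, 256, 256, 3)
```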
In addition, because the dataset used in this work is relatively small, we preprocessed the data into npy files to speed up training. For larger datasets, it is indeed necessary to load images directly from disk. You might refer to other video-related work to write the corresponding dataloader.
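As a rough starting point, a dataset that reads images lazily could look like the sketch below. Everything here is hypothetical and not part of this repository: the class name `FrameDataset`, the `load_fn` parameter, the per-sample `labels` list, and the use of Pillow for decoding. An object with `__len__` and `__getitem__` like this one should also be usable as a map-style dataset with `torch.utils.data.DataLoader`, though that is not shown here.

```python
import numpy as np


def load_image_pil(path, size=(256, 256)):
    """Decode one image to a (256, 256, 3) uint8 array. Assumes Pillow is installed."""
    from PIL import Image  # imported lazily so the class works with other loaders

    with Image.open(path) as img:
        return np.asarray(img.convert("RGB").resize(size), dtype=np.uint8)


class FrameDataset:
    """Hypothetical map-style dataset that loads one image per sample on demand."""

    def __init__(self, image_paths, labels, load_fn=load_image_pil):
        if len(image_paths) != len(labels):
            raise ValueError("image_paths and labels must have the same length")
        self.image_paths = list(image_paths)
        self.labels = list(labels)
        self.load_fn = load_fn

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        # Only the requested image is read, so memory use stays flat as N grows.
        image = self.load_fn(self.image_paths[idx])
        return image, self.labels[idx]
```

Because each image is decoded only when it is indexed, this trades some per-step loading time for not having to hold the whole (N, 256, 256, 3) array in memory.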
I hope this helps.
Hello @Pei-KaiHuang, this is nice work. I see you load data from .npy files (train.py, line 77). Could you give me more detail about the .npy files? Or could you provide a way to load images directly as a dataset?