Open dinglei8908 opened 2 years ago
Not done yet, but possible. We still need to figure out whether we need to change the pretraining or can simply adapt the pretrained models to this task. The simplest way might be to treat the average of the frames as a single image.
Suppose we can extract several frames from a video. Any suggestions on how to use them?
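For what it's worth, here is a rough sketch of the "average of frames as an image" idea mentioned above, assuming the video is already decoded into a NumPy array of shape (T, H, W, C) with uint8 pixels. The function names `sample_frame_indices` and `average_frames` are made up for illustration and are not part of this repo; how the averaged image is then fed to the pretrained model is left open.

```python
import numpy as np


def sample_frame_indices(num_frames_total, num_samples):
    """Pick evenly spaced frame indices (hypothetical helper, not from the repo)."""
    if num_frames_total <= 0 or num_samples <= 0:
        raise ValueError("both arguments must be positive")
    num_samples = min(num_samples, num_frames_total)
    positions = np.linspace(0, num_frames_total - 1, num=num_samples)
    return np.round(positions).astype(int)


def average_frames(frames):
    """Average frames pixel-wise into one uint8 image.

    Assumes `frames` has shape (T, H, W, C) with values in [0, 255].
    """
    frames = np.asarray(frames)
    if frames.ndim != 4 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty array of shape (T, H, W, C)")
    # Accumulate in float to avoid uint8 overflow, then round back.
    mean = frames.astype(np.float32).mean(axis=0)
    return np.clip(np.rint(mean), 0, 255).astype(np.uint8)


# Toy "video": 11 frames of 4x4 RGB, where frame i is filled with the value i * 10.
video = np.stack([np.full((4, 4, 3), i * 10, dtype=np.uint8) for i in range(11)])
indices = sample_frame_indices(len(video), 3)
image = average_frames(video[indices])
```

The result has the same shape as a single frame, so it could in principle go through the usual image preprocessing unchanged. Averaging throws away motion and temporal order, so this is only a baseline to try before touching pretraining.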