Hi, thanks for open-sourcing your excellent work. Can I ask what FPS the video training is done at? In your video inference example you uniformly sample 16 frames, which works out to slightly less than 1 frame per second.
Edit: Just found it in the paper. Should we always run inference at 1 FPS?
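
To make the question concrete, here is a minimal sketch of the arithmetic as I understand it, comparing uniform sampling of a fixed number of frames against sampling at a fixed rate. The function names, the 30 FPS native rate, and the 20-second clip length are all hypothetical values chosen for illustration; this is not the repo's actual sampling code.

```python
def uniform_indices(total_frames: int, num_samples: int) -> list[int]:
    """Spread num_samples indices evenly over the clip, endpoints included."""
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    if num_samples == 1:
        return [0]
    step = (total_frames - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]


def fixed_fps_indices(total_frames: int, native_fps: float,
                      target_fps: float = 1.0) -> list[int]:
    """Take one frame every native_fps / target_fps frames, starting at frame 0."""
    if total_frames <= 0 or native_fps <= 0 or target_fps <= 0:
        raise ValueError("all arguments must be positive")
    stride = native_fps / target_fps
    indices = []
    i = 0
    while int(i * stride) < total_frames:
        indices.append(int(i * stride))
        i += 1
    return indices


def effective_fps(num_samples: int, total_frames: int, native_fps: float) -> float:
    """Sampled frames per second of video: samples divided by clip duration."""
    return num_samples / (total_frames / native_fps)


if __name__ == "__main__":
    # Hypothetical clip: 20 seconds at 30 FPS (assumed values, not from the repo).
    native_fps = 30.0
    total_frames = 600

    uniform = uniform_indices(total_frames, 16)
    one_fps = fixed_fps_indices(total_frames, native_fps, target_fps=1.0)

    print("uniform 16 frames:", uniform)
    print("effective rate:", effective_fps(len(uniform), total_frames, native_fps))
    print("1 FPS sampling yields", len(one_fps), "frames")
```

With uniform sampling the effective rate depends on the clip length, so 16 frames only matches 1 FPS for a clip of about 16 seconds. Fixed-rate sampling keeps the rate constant, but the frame count then grows with the clip.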