Status: Closed (closed by javadmozaffari 3 months ago)
Hi, sorry for the late reply.
I think it might be hard, because the boundary matching mechanism requires all frames as input. To save memory, though, I think you can try 2 ways of reducing the temporal size.
However, the pretrained model was not trained with this preprocessing, so it might not perform well if you evaluate it this way.
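For illustration only: the two options are not spelled out in this thread, so the sketch below shows just one common way to shrink the temporal dimension, which is keeping every k-th frame. The function names, the `stride` parameter, and the `(T, C, H, W)` array layout are assumptions made for this example and are not taken from the repository.

```python
import numpy as np


def subsample_frames(video: np.ndarray, stride: int = 2) -> np.ndarray:
    """Keep every `stride`-th frame along the temporal (first) axis.

    `video` is assumed to have shape (T, C, H, W); this layout is hypothetical.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return video[::stride]


def to_original_index(frame_index: int, stride: int) -> int:
    """Map a frame index in the subsampled clip back to the original video."""
    return frame_index * stride


# Dummy clip: 64 frames of 3x8x8 pixels.
video = np.zeros((64, 3, 8, 8), dtype=np.float32)
short = subsample_frames(video, stride=4)
print(short.shape)    # temporal length shrinks by roughly a factor of `stride`
print(short.nbytes / video.nbytes)
```

Any boundaries predicted on the shortened clip would need to be mapped back with something like `to_original_index`, and, as noted above, a model pretrained on full-length input may lose accuracy on input preprocessed this way.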
Hello,
In this new Temporal Forgery Localization model, the entire video is used as input. The model proposed by the authors has shown promise in achieving accurate results. However, feeding in the entire video may be a problem for memory consumption, especially with large datasets or videos with high-resolution frames. Would it be possible to modify the Temporal Forgery Localization model to accept individual frames instead of the entire video? That would reduce the amount of RAM required.