Question about training on my own dataset

Thanks for your great work. I recently ran into a problem while using it, and here is my question.

Dataset basic info: fps: 5; video duration: 1s-25s.

Feature extraction method: VideoMAE, with each feature covering 5 frames and a feature stride of 1.

Config: num_frames: 5, feat_stride: 1, max_seq_len: 128, backbone_arch: [2, 2, 1], regression_range: [[0, 4], [4, 10000]], ...

At inference, I only get results with reasonable scores for the first 5 seconds of each video. For the rest of the video, the scores are very low, around 0.008, and every test video shows the same problem. I am not sure whether a mistake in my training config is causing this. Any suggestions would be appreciated.
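For reference, here is a minimal sketch of the arithmetic implied by the settings above. It assumes sliding-window feature extraction and that a feature index maps to the center of its 5-frame clip. The helper names `num_features` and `feature_center_seconds` are hypothetical and not part of the repository.

```python
# Hypothetical helpers for sanity-checking the settings in this post.
# Assumption: features come from a sliding window of `num_frames` frames
# that moves `feat_stride` frames at a time.

def num_features(duration_s, fps=5, num_frames=5, feat_stride=1):
    """Number of feature vectors a video of the given duration produces."""
    total_frames = int(round(duration_s * fps))
    if total_frames < num_frames:
        return 0
    return (total_frames - num_frames) // feat_stride + 1


def feature_center_seconds(index, fps=5, num_frames=5, feat_stride=1):
    """Timestamp (in seconds) of the center of the clip behind one feature."""
    return (index * feat_stride + 0.5 * num_frames) / fps


if __name__ == "__main__":
    max_seq_len = 128
    for duration in (1, 5, 25):
        n = num_features(duration)
        last = feature_center_seconds(n - 1) if n else None
        print(duration, n, n <= max_seq_len, last)
```

Under these assumptions, the longest video (25 s) yields 121 features, which fits within max_seq_len: 128, and the first 5 seconds correspond to roughly the first 21 features.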

happyharrycn / actionformer_release

Question about training on my own dataset #137