happyharrycn / actionformer_release

Code release for ActionFormer (ECCV 2022)
MIT License

Question about regression ranges #71

Closed by zivnachum 1 year ago

zivnachum commented 1 year ago

Hi, I have a dataset with actions that are much longer than the actions in THUMOS. I'm trying to understand how to choose regression ranges that will fit my data best. Any suggestions on how to do this and how to interpret their values? Thanks!

tzzcl commented 1 year ago

To answer your question, you should first look at the statistics of the action durations in your dataset, since I don't know what they are. The regression range in ActionFormer is the maximum regression time (in seconds) from the current timestamp to the start or end of an action. ActionFormer can also handle long action instances (around 10 minutes) by increasing the number of multi-level features; please refer to the Ego4D configs, especially backbone_arch and regression_range, for more details.
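
As a rough illustration of the first step (gathering duration statistics), here is a minimal sketch. It assumes the annotations are available as (start, end) pairs in seconds; the function name `duration_stats` and the sample segments are hypothetical and not part of the ActionFormer codebase.

```python
import statistics


def duration_stats(segments):
    """Summarize action durations from (start_sec, end_sec) pairs."""
    durations = sorted(end - start for start, end in segments)
    # quantiles(n=100) returns 99 cut points: the 1st through 99th percentiles.
    pct = statistics.quantiles(durations, n=100, method="inclusive")
    return {
        "min": durations[0],
        "median": statistics.median(durations),
        "p95": pct[94],
        "max": durations[-1],
    }


# Hypothetical annotations, in seconds (not taken from any real dataset).
segments = [(0.0, 12.0), (30.0, 95.0), (100.0, 400.0), (10.0, 20.0), (50.0, 650.0)]
stats = duration_stats(segments)
print(stats)
```

The upper percentiles give a sense of how far the largest regression range needs to reach, and the spread between the minimum and the maximum hints at how many pyramid levels may be needed.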

zivnachum commented 1 year ago

Thanks for the response! I have another couple of follow-up questions.

  1. If regression range means the maximum regression time from the current timestamp to the start/end of the action, does it mean the duration of an event?
  2. What do you mean by increasing the number of multi-level features? Adding more levels to the pyramid?
  3. Comparing Ego4D config with THUMOS config, the regression ranges overlap. Is there a specific reason to do so?

Thanks again!

tzzcl commented 1 year ago

For your follow-up questions:

  1. In a sense, you can treat it as the duration of an event. If the regression range of a layer is [0, 4], then actions with durations of [0, 8] seconds will be assigned to that layer. If the regression range is [8, 16], then actions with durations of [16, 32] seconds will be assigned to that layer.
  2. Yes. Just adding more levels to the pyramid.
  3. I think overlapping regression ranges increase the number of positive examples for ActionFormer. For more details, you can ask @fmu2.
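
The rule of thumb in answers 1 and 3 can be sketched as follows. This is a simplification, not the actual label assignment in the ActionFormer code: it assumes an action is viewed from its center, so the distance to each boundary is half the duration, and a level covers the action when that distance falls inside the level's range. The helper names and the example ranges are hypothetical.

```python
def duration_span(regression_range):
    """Durations covered by a level under the 'duration = 2 x range' rule of thumb."""
    lo, hi = regression_range
    return (2 * lo, 2 * hi)


def levels_for_duration(duration, regression_ranges):
    """Indices of pyramid levels whose range contains half the action duration."""
    half = duration / 2.0
    return [i for i, (lo, hi) in enumerate(regression_ranges) if lo <= half <= hi]


# Hypothetical ranges, in the same units as the durations.
disjoint = [(0, 4), (4, 8), (8, 16)]
overlapping = [(0, 4), (2, 8), (4, 16)]

print(duration_span((8, 16)))                # (16, 32)
print(levels_for_duration(6, disjoint))      # [0]
print(levels_for_duration(6, overlapping))   # [0, 1]
print(levels_for_duration(10, overlapping))  # [1, 2]
```

With the overlapping ranges, the same action is matched by more than one level, which is one way to read the remark that overlap increases the number of positive examples.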

happyharrycn commented 1 year ago

To add to Chen-Lin's comments, if you are looking to localize actions of much longer duration, possible strategies include

zivnachum commented 1 year ago

Thanks! Much appreciated!