XFeiF / ComputerVision_PaperNotes

📚 Paper Notes (Computer vision)
20CVPR| A Multigrid Method for Efficiently Training Video Models #21

XFeiF commented 3 years ago

Paper & Code

Authors: Chao-Yuan Wu (1,2), Ross Girshick (2), Kaiming He (2), Christoph Feichtenhofer (2), Philipp Krähenbühl (1)
(1) The University of Texas at Austin, (2) Facebook AI Research (FAIR)

Problem to be tackled: High-resolution models perform well but train slowly. Low-resolution models train faster but are less accurate.
The trade-off is between the computation allocated to processing more examples per mini-batch and the computation allocated to processing larger temporal and spatial dimensions.

Core observation: The underlying sampling grid that is used to train video models need not be constant during training.

Highlight

To avoid this trade-off, the paper proposes variable mini-batch shapes whose spatial-temporal resolutions change according to a schedule. Training is accelerated by scaling up the mini-batch size and learning rate whenever the other dimensions shrink. With this strategy, training is faster without losing accuracy.
Different shapes: obtained by resampling the training data on multiple sampling grids.
Sampling grid: specified by a temporal span, a spatial span, a temporal stride, and a spatial stride.
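
The batch-size and learning-rate scaling can be illustrated with a minimal sketch. It assumes the mini-batch size grows by the same factor that the T x H x W volume shrinks, and that the learning rate follows the batch size linearly; the notes above only say that both are "scaled up", so the exact factors, the function name `scaled_batch`, and the numbers are illustrative assumptions.

```python
def scaled_batch(base_batch, base_lr, base_shape, new_shape):
    """Return (batch_size, lr) for a mini-batch with a new T x H x W shape.

    Assumption: the product batch * frames * height * width is kept
    roughly constant, and the learning rate scales linearly with batch size.
    """
    T, H, W = base_shape
    t, h, w = new_shape
    # Factor by which the per-example volume shrinks.
    scale = (T * H * W) / (t * h * w)
    batch = max(1, round(base_batch * scale))
    # Linear scaling: learning rate grows with the batch size.
    lr = base_lr * batch / base_batch
    return batch, lr


# Hypothetical baseline: 8 clips of 16 x 224 x 224 at learning rate 0.1.
batch, lr = scaled_batch(8, 0.1, (16, 224, 224), (8, 112, 112))
print(batch, lr)
```

Halving the frame count and both spatial sides shrinks each example by a factor of 8, so under these assumptions the coarse grid fits 8 times as many clips per mini-batch.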

Methods

Baseline: a reference video model (e.g., C3D, I3D) trained by a baseline mini-batch optimizer (e.g., SGD) that operates on mini-batches of shape B x T x H x W (mini-batch size x number of frames x height x width) for some number of epochs (e.g., 100).

This paper: considers temporal and spatial shapes t x h x w that are formed by resampling the source videos with a new sampling grid that has its own spans and strides.
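
As a rough illustration of resampling with a sampling grid, the sketch below treats the span as the number of source positions the grid covers and the stride as the spacing between sampled positions. It uses plain index subsampling on a nested-list toy video; a real pipeline would likely use random crops and interpolation, and the helper names `grid_indices` and `resample` are hypothetical.

```python
def grid_indices(span, stride, start=0):
    """Positions sampled by a 1-D grid covering `span` source positions."""
    return list(range(start, start + span, stride))


def resample(video, t_span, t_stride, s_span, s_stride):
    """Subsample a video stored as nested lists [frame][row][col].

    The same spatial span and stride are applied to rows and columns.
    """
    ts = grid_indices(t_span, t_stride)
    ss = grid_indices(s_span, s_stride)
    return [[[video[t][y][x] for x in ss] for y in ss] for t in ts]


# Toy source video: 8 frames of 4 x 4, each value records its own (t, y, x).
video = [[[(t, y, x) for x in range(4)] for y in range(4)] for t in range(8)]

# A coarser grid: same spans, but stride 2 in time and space.
clip = resample(video, t_span=8, t_stride=2, s_span=4, s_stride=2)
shape = (len(clip), len(clip[0]), len(clip[0][0]))
print(shape)
```

Keeping the spans fixed while increasing the strides covers the same duration and area with fewer samples, which is what makes the resulting t x h x w shape smaller.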

XFeiF commented 3 years ago

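In short, the multigrid method applies different sampling grids to the source videos during training to obtain mini-batches of different shapes, so the mini-batch shape changes dynamically over the course of training. The benefit is that training first uses large mini-batches (with relatively small temporal and spatial dimensions, i.e., a coarse grid) and later uses small mini-batches (with relatively large temporal and spatial dimensions, i.e., a finer grid). On average, SGD sweeps through the data faster and still ends up with a high-accuracy model.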
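
A coarse-to-fine schedule of this kind could be laid out as in the sketch below. The equal-length stages, the list of shapes, and the function name `coarse_to_fine_schedule` are assumptions made for illustration and are not the schedule used in the paper; the batch size per stage is again assumed to grow by the factor the clip volume shrinks.

```python
def coarse_to_fine_schedule(base_batch, base_shape, shapes, total_epochs):
    """Split training into equal stages, one per shape, coarsest first.

    Assumption: equal stage lengths, with any leftover epochs going to
    the final (finest) stage.
    """
    T, H, W = base_shape
    per_stage = total_epochs // len(shapes)
    schedule = []
    for i, (t, h, w) in enumerate(shapes):
        # Larger batches for coarser grids (integer division keeps it whole).
        batch = base_batch * (T * H * W) // (t * h * w)
        start = i * per_stage
        end = total_epochs if i == len(shapes) - 1 else start + per_stage
        schedule.append({"epochs": (start, end), "shape": (t, h, w), "batch": batch})
    return schedule


# Hypothetical shapes, ordered from coarse to fine; the last is the baseline.
shapes = [(4, 112, 112), (8, 112, 112), (8, 224, 224), (16, 224, 224)]
schedule = coarse_to_fine_schedule(8, (16, 224, 224), shapes, total_epochs=100)
for stage in schedule:
    print(stage)
```

Early stages see many small clips per step and the last stage matches the baseline shape, so the model finishes training at the resolution it is evaluated at.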