PaddlePaddle / PaddleVideo

Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton based action recognition model, practical applications for video tagging and sport action detection.
Apache License 2.0
1.49k stars 375 forks source link

Call for suggestions #68

Closed chajchaj closed 6 months ago

chajchaj commented 3 years ago

This issue is for collecting suggestions. You can either: 1.Suggest a new feature by leaving a comment. 2.Vote for a feature request with +1 or be against with -1. 3.Tell us that you would like to help implement one of the features in the list.

chajchaj commented 3 years ago
  1. Optimize some video models , e.g. pp-TSM, by pruning, quantization, knowledge distillation, better backbone, to achieve a better balance between effectiveness(Top1) and efficiency(VPS).
  2. Release some tools for some video datasets, e.g. downloading Kinetics-400, Youtube-8M.
  3. Release some applications using video models, e.g. video tagging, Highlight snippet extraction, Security Monitoring, Video retrieval, Student behavior analysis in classroom and so on.
  4. Support X3D and Multi-Grid by FAIR
  5. Add some video models technology analysis docs, e.g. TSN, TSM, SlowFast, X3D, Multi-Grid and so on.
  6. Add more lightweight video models and Optimization Strategy.
  7. Support more backbones and some pretrained backbone models for video.
  8. Support ava dataset and spatio-temporal action detection models.
  9. Optimize Video decoding speed by DALI.
  10. Support Distributed training for video models .
  11. Add ECO: Efficient Convolutional Network for Online Video Understanding, ECCV 2018
  12. Add 3D Resnet: Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs, CVPR 2018
  13. Add TPN: Temporal Pyramid Network for Action Recognition, CVPR 2020
  14. Add EvaNet: Evolving Space-Time Neural Architectures for Videos, ICCV 2019
  15. Add RepFlow: Representation Flow for Action Recognition, CVPR 2019
  16. Add MARS: Motion-Augmented RGB Stream for Action Recognition, CVPR 2019
  17. Add StNet: Local and Global Spatial-Temporal Modeling for Human Action Recognition, AAAI 2019
  18. Add Attention Cluster: Purely Attention Based Local Feature Integration for Video Classification
  19. Add NeXtVLAD: An Efficient Neural Network to Aggregate Frame-level Features for Large-scale Video Classification
  20. Add C-TCN: Action localization Model by Baidu, the Champion model of ActivityNet 2018
linhandev commented 3 years ago

Maybe provide a readthedocs.io version ducomentation? It shall be easier to read and navigate than the current version which lives inside github repo

chajchaj commented 3 years ago

Maybe provide a readthedocs.io version ducomentation? It shall be easier to read and navigate than the current version which lives inside github repo

We have received your suggestion. We will evaluate the demand and determine R & D schedule as soon as possible.

Deep-learning999 commented 3 years ago

Can you provide a video prompt ipynb for detection and tracking + slowfast for the ava data set