CVPR 2016 papers related to video classification

y-wan commented 7 years ago

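Claiming this spot for now. CVPR 2016 accepted 643 papers, and fewer than 59 of them have "video" in the title (a few titles contain the word "video" more than once). I will go through them below one by one, in the order they appear in the CVPR 2016 accepted-paper list.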
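The gap between "occurrences of the word" and "number of titles" can be checked with a short script. The following is only a minimal sketch: the function names and the four sample titles are illustrative assumptions (the titles are taken from this thread, not from the full accepted-paper list), and the match is a case-insensitive substring match, so "Videos" and "Video2GIF" also count.

```python
import re


def titles_with_keyword(titles, keyword="video"):
    """Return the titles that contain `keyword` at least once (case-insensitive)."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [title for title in titles if pattern.search(title)]


def keyword_occurrences(titles, keyword="video"):
    """Count every occurrence of `keyword` across all titles (case-insensitive)."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return sum(len(pattern.findall(title)) for title in titles)


# Hypothetical sample: a real run would load all 643 accepted titles instead.
sample_titles = [
    "Anticipating Visual Representations From Unlabeled Video",
    "MSR-VTT: A Large Video Description Dataset for Bridging Video and Language",
    "Temporal Epipolar Regions",
    "Video2GIF: Automatic Generation of Animated GIFs From Video",
]

matched = titles_with_keyword(sample_titles)
print(len(matched))                        # 3 titles contain the keyword
print(keyword_occurrences(sample_titles))  # 5 occurrences in total (1 + 2 + 2)
```

Counting titles rather than occurrences is what keeps a paper such as MSR-VTT from being counted twice.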

Progress for May 14 to May 19:

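Done: a first pass has produced 57 papers related to video/action recognition, 10 of which are marked as possibly relevant to our competition.
Next: for the 10 highly relevant papers, open a new issue after reading each one through, covering its approach, model, source code, and other details.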

NiyunZhou commented 7 years ago

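@y-wan I suggest opening a new issue for each paper, so that each paper's comments can be used to discuss that paper. Papers we do not plan to try can also be filed away by closing their issues.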

y-wan commented 7 years ago

@NiyunZhou OK. How about I keep this issue for collecting the papers that have "video" in the title but that I consider irrelevant?

NiyunZhou commented 7 years ago

@y-wan Sounds good. That way we can also come back later and check whether we missed anything.

haozheji commented 7 years ago

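I don't think papers on video classification necessarily appear only in recent years. I Googled a CVPR 2014 paper that is highly relevant: Large-scale Video Classification with Convolutional Neural Networks. We can also just search Google directly.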

y-wan commented 7 years ago

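@cdjhz OK. I just read a few CVPR 2016 papers, and the share that is relevant to us seems smaller than expected, so the sorting should go fairly quickly. I will work year by year, from the most recent backwards.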

y-wan commented 7 years ago

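This comment is the single place for recording the CVPR 2016 video-related papers I have read through, along with their topics. The entries I personally consider relevant to our competition (7, 9, 15, 18, 26, 27, 33, 34, 37, 47, counting by position in the list below) are marked in bold.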

Anticipating Visual Representations From Unlabeled Video: anticipating actions and objects in videos
Coherent Parametric Contours for Interactive Video Object Segmentation: video object segmentation
Primary Object Segmentation in Videos via Alternate Convex Optimization of Foreground and Background Distributions: video object segmentation
Automatic Fence Segmentation in Videos of Dynamic Scenes: video object segmentation
Discovering the Physical Parts of an Articulated Object Class From Multiple Videos: video object segmentation (inner region segmentation of objects)
A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation: video object segmentation
Learning Temporal Regularity in Video Sequences: detection of regularities in videos (understanding videos) (anomalous-frame detection; off-topic)
Bilateral Space Video Segmentation: video object segmentation
Object Detection From Video Tubelets With Convolutional Neural Networks: object detection from video (VID) (object detection; off-topic)
You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images: video concept learning
Track and Segment: An Iterative Unsupervised Approach for Video Object Proposals: video object segmentation
Highlight Detection With Pairwise Deep Ranking for First-Person Video Summarization: video highlight detection & video summarization
Video2GIF: Automatic Generation of Animated GIFs From Video: video highlight detection & video summarization
Hierarchical Recurrent Neural Encoder for Video Representation With Application to Captioning: video captioning & video temporal structure
From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection: video summarization by proposing representative objects (no source code, and not very relevant to our topic)
Temporal Action Localization in Untrimmed Videos via Multi-Stage CNNs: video action localization & video summarization
Summary Transfer: Exemplar-Based Subset Selection for Video Summarization: video summarization
POD: Discovering Primary Objects in Videos Based on Evolutionary Refinement of Object Recurrence, Background, and Primary Object Models: video highlight detection by proposing primary objects (primary object detection; not very relevant to our topic)
What If We Do Not Have Multiple Videos of the Same Action? -- Video Action Localization Using Web Images: video action localization
Recurrent Convolutional Network for Video-Based Person Re-Identification: video-based person re-identification
Top-Push Video-Based Person Re-Identification: video-based person re-identification
A Hole Filling Approach Based on Background Reconstruction for View Synthesis in 3D Video: background reconstruction
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network: image and video super-resolution
Cascaded Interactional Targeting Network for Egocentric Video Analysis: egocentric action recognition in videos
Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos: action recognition (recovers temporal segments containing actions in untrimmed videos)
Discriminative Hierarchical Rank Pooling for Activity Recognition: video activity recognition & video representation (a paper I noticed that has no "video" in its title but may be relevant; unfortunately there is no source code...)
Convolutional Two-Stream Network Fusion for Video Action Recognition: video activity recognition (state of the art at the time of publication; MATLAB source code available)
Walk and Learn: Facial Attribute Representation Learning From Egocentric Video and Contextual Data: egocentric video representation with the help of contextual data
Face2Face: Real-Time Face Capture and Reenactment of RGB Videos: face detection & facial reenactment
Self-Adaptive Matrix Completion for Heart Rate Estimation From Face Videos Under Realistic Conditions: heart rate estimation via face videos
Automating Carotid Intima-Media Thickness Video Interpretation With Convolutional Neural Networks: (another paper aimed at disease analysis)
Recognizing Micro-Actions and Reactions From Paired Egocentric Videos: people action recognition via paired egocentric videos
End-To-End Learning of Action Detection From Frame Glimpses in Videos: video action localization & video action recognition (Lua source code)
Action Recognition in Video Using Sparse Coding and Relative Features: video summarization, video action recognition & video classification (could not find source code)
Detecting Events and Key Actors in Multi-Person Videos: multi-person event classification and detection (generalizable to any multi-person setting)
Personalizing Human Video Pose Estimation: video pose estimation
Harnessing Object and Scene Semantics for Large-Scale Video Understanding: large-scale action recognition and video categorization (could not find source code)
Video-Story Composition via Plot Analysis: video composition from multiple video clips
Feature Space Optimization for Semantic Video Segmentation: feature optimization for video object segmentation
Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection: image recognition with region transfer from videos
Instance-Level Video Segmentation From Object Tracks: video object segmentation
Amplitude Modulated Video Camera - Light Separation in Dynamic Scenes: (irrelevant)
Panoramic Stereo Videos With a Single Camera: (as title)
Recognizing Car Fluents From Video: (focused on cars)
Inferring Forces and Learning Human Utilities From Videos: (irrelevant)
Force From Motion: Decoding Physical Sensation in a First Person Video: (irrelevant)
Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video: object recognition, scene classification & action recognition (again, no source code)
Video Segmentation via Object Flow: video object segmentation
An Egocentric Look at Video Photographer Identity: (irrelevant)
Unsupervised Learning From Narrated Instruction Videos: learning main steps of tasks from instruction videos
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks: video captioning (interesting; no source code provided)
Jointly Modeling Embedding and Translation to Bridge Video and Language: video captioning & visual interpretation by language
Sparseness Meets Deepness: 3D Human Pose Estimation From Monocular Video: (irrelevant)
MSR-VTT: A Large Video Description Dataset for Bridging Video and Language: proposal of a dataset for video captioning
LOMo: Latent Ordinal Model for Facial Analysis in Videos: facial analysis in videos
Slicing Convolutional Neural Network for Crowd Video Understanding: crowd video understanding

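A few additional CVPR 2016 papers that may be helpful to us (the first four offer little we can borrow or have no source code; the last one is so math-heavy that porting it over would likely take a lot of time and effort):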

Dynamic Image Networks for Action Recognition: action recognition & video representation; produces a single RGB dynamic image per video (source code)
Temporal Epipolar Regions
Temporal Action Localization With Pyramid of Score Distribution Features
Temporal Action Detection Using a Statistical Language Model
Efficient Temporal Sequence Comparison and Classification Using Gram Matrix Embeddings on a Riemannian Manifold

NiyunZhou commented 7 years ago

@cdjhz

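I don't think papers on video classification necessarily appear only in recent years. I Googled a CVPR 2014 paper that is highly relevant: Large-scale Video Classification with Convolutional Neural Networks. We can also just search Google directly.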

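I think the year still matters quite a bit. A lot has changed in two years, and a model from 2014 is probably no longer the best one today. If we try all of the current best ones, that should be about enough.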

NiyunZhou / The21-dayExpendables

CVPR 2016 papers related to video classification #5