srkds / CHAN-QFVS-PyTorch-Implementation

PyTorch implementation of the paper "Convolutional Hierarchical Attention Network for Query-Focused Video Summarization", accepted at the AAAI 2020 conference.
attention-mechanism cnn computer-vision deep-learning pytorch pytorch-implementation video-processing video-summarization videosummarization

Convolutional Hierarchical Attention Network for Query-Focused Video Summarization (CHAN): A PyTorch Implementation

This is a PyTorch implementation of the paper "Convolutional Hierarchical Attention Network for Query-Focused Video Summarization", which was accepted at the AAAI 2020 conference.

Note: This project is still a work in progress.

🎥 Model Details

[Figures: Parallel Computing Model; Simple Model Diagram]

📑 Dataset

📈 Loss Function and Evaluation Method

📊 Results

Below is the generated video summary for the query FOOD and HANDS. The source is a ~4-hour video covering diverse scenes such as a library, a mall, driving, and a shop. From it, the model generated a summary of about 4 minutes 30 seconds, made up of clips that show either food or hands in frame.

https://github.com/srkds/CHAN-QFVS-PyTorch-Implementation/assets/61644078/5ed127f7-06fe-4d91-85e7-9626ebc38b6c
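
To illustrate how a two-concept query such as FOOD and HANDS could turn per-shot relevance scores into a short summary, here is a minimal sketch. It is not taken from this repository: the function name `select_summary_shots`, the `budget` parameter, the example scores, and the choice to combine concepts with a per-shot maximum are all assumptions made for illustration.

```python
from typing import Dict, List


def select_summary_shots(
    concept_scores: Dict[str, List[float]],
    budget: int,
) -> List[int]:
    """Pick up to `budget` shot indices that best match any query concept.

    Hypothetical helper: `concept_scores` maps each concept name to one
    relevance score per shot. This is an illustrative selection rule,
    not the selection logic used by this project.
    """
    num_shots = len(next(iter(concept_scores.values())))
    # A shot counts as relevant if it matches either concept,
    # so combine the concepts with a per-shot maximum (an assumption).
    combined = [
        max(scores[i] for scores in concept_scores.values())
        for i in range(num_shots)
    ]
    # Rank shots by combined score, highest first, and keep the top `budget`.
    ranked = sorted(range(num_shots), key=lambda i: combined[i], reverse=True)
    # Return the kept shots in temporal order so the summary plays chronologically.
    return sorted(ranked[:budget])


# Made-up scores for five shots; real scores would come from a trained model.
scores = {
    "FOOD": [0.1, 0.9, 0.2, 0.3, 0.05],
    "HANDS": [0.2, 0.1, 0.8, 0.1, 0.1],
}
summary = select_summary_shots(scores, budget=2)
print(summary)  # [1, 2]
```

With a fixed shot length, `budget` would control the summary duration: a small budget over a multi-hour video yields a summary of only a few minutes.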

Installation

Step 1: Install dependencies

pip install -r requirements.txt

Step 2: Run the model

python main.py

Model Settings and Experiment Details

Todo

πŸ™ Acknowledgement

The implementation and study of this paper are part of my ongoing research under the guidance of Prof. Payal Prajapati.

The evaluation code is borrowed from EgoVLPv2.

The code is inspired by the original CHAN implementation: https://github.com/ckczzj/CHAN