Mr. HiSum is a large-scale video highlight detection and summarization dataset, which contains 31,892 videos selected from the YouTube-8M dataset and reliable frame importance score labels aggregated from the viewing behavior of 50,000+ users per video.
[Figure: panels 1–4] The four most viewed scenes in the "AC Sparta Praha" video (Link) all show players scoring goals.
[Figure: panels 1–4] The four most viewed scenes in the above video all show players scoring goals with amazing bicycle kicks. (Link)
[Figure: panels 1–3] In the most viewed scene, marked 1 in the video, Neo is shot as soon as he meets Agent Smith. In the second most viewed scene, marked 2, a crowd of Agent Smiths shoots at Neo, and Neo reaches out his hand to block the bullets. Lastly, in the third most viewed scene, marked 3, Neo engages in combat with Agent Smith. (Link)
Download the YouTube-8M dataset and place it under your dataset path. For example, when your dataset path is `/data/dataset/`, place your `yt8m` folder under that path.
Download `mr_hisum.h5` and `metadata.csv` and place them under the `dataset` folder.
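To avoid confusion between your dataset path and the repository's `dataset` folder, the expected layout looks roughly like this (the repository root name is a placeholder):

```
/data/dataset/           # your dataset path
└── yt8m/                # YouTube-8M files
<repository root>/
└── dataset/
    ├── mr_hisum.h5
    └── metadata.csv
```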
Create a virtual environment using the following commands:

```
conda env create -f environment.yml
conda activate mrhisum
```
Your `mr_hisum.h5` needs four fields:

- `features`: Video frame features from the YouTube-8M dataset.
- `gtscore`: The "Most replayed" statistics, normalized to a score between 0 and 1.
- `change_points`: Shot boundary information obtained using the Kernel Temporal Segmentation (KTS) algorithm.
- `gtsummary`: Ground truth summary obtained by solving a 0/1 knapsack problem over shots (see the sketch below).

We provide three of these fields, `gtscore`, `change_points`, and `gtsummary`, inside `mr_hisum.h5`.
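As a rough illustration of how a `gtsummary`-style label can be derived from `gtscore` and `change_points`, here is a minimal knapsack sketch. The 15% length budget, the mean-score aggregation, and the inclusive `[start, end]` shot boundaries are assumptions for illustration, not the dataset's exact recipe:

```python
import numpy as np

def knapsack_summary(gtscore, change_points, budget_ratio=0.15):
    """Select shots maximizing total importance under a length budget.

    gtscore: (n_frames,) per-frame importance in [0, 1].
    change_points: (n_shots, 2) inclusive [start, end] frame indices (assumed).
    budget_ratio: assumed 15% summary length budget (illustrative).
    """
    n_frames = len(gtscore)
    # Shot value = mean frame importance; shot weight = length in frames.
    values = np.array([gtscore[s:e + 1].mean() for s, e in change_points])
    weights = np.array([e - s + 1 for s, e in change_points])
    capacity = int(n_frames * budget_ratio)

    # Classic 0/1 knapsack dynamic program over shots.
    n = len(values)
    dp = np.zeros((n + 1, capacity + 1))
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for c in range(capacity + 1):
            dp[i][c] = dp[i - 1][c]
            if w <= c:
                dp[i][c] = max(dp[i][c], dp[i - 1][c - w] + v)

    # Backtrack to recover the selected shots as a binary per-frame summary.
    summary = np.zeros(n_frames, dtype=np.int32)
    c = capacity
    for i in range(n, 0, -1):
        if dp[i][c] != dp[i - 1][c]:
            s, e = change_points[i - 1]
            summary[s:e + 1] = 1
            c -= weights[i - 1]
    return summary
```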
After downloading the YouTube-8M dataset, you can add the `features` field using:

```
python preprocess/preprocess.py --dataset_path <your_dataset_path>/yt8m
```

For example, when your dataset path is `/data/dataset/`, use the command below.

```
python preprocess/preprocess.py --dataset_path /data/dataset/yt8m
```
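To sanity-check the result, a quick inspection sketch (assuming the `h5py` package, and assuming videos are stored as top-level groups each holding the four fields; the exact group keys may differ):

```python
import h5py

# Open the prepared dataset and inspect one video entry.
with h5py.File("dataset/mr_hisum.h5", "r") as f:
    video_id = list(f.keys())[0]          # e.g. the first video group
    group = f[video_id]
    print(video_id, list(group.keys()))   # expect: features, gtscore, change_points, gtsummary
    print("features:", group["features"].shape)
    print("gtscore:", group["gtscore"].shape)
```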
Please read DATASET.md for more details about Mr.HiSum.
We provide compatible code for three baseline models: PGL-SUM, VASNet, and SL-module.
You can train baseline models on Mr.HiSum from scratch using the following commands.
PGL-SUM

```
python main.py --train True --model PGL_SUM --batch_size 256 --epochs 200 --tag train_scratch
```

VASNet

```
python main.py --train True --model VASNet --batch_size 256 --epochs 200 --tag train_scratch
```

SL-module

```
python main.py --train True --model SL_module --batch_size 256 --lr 0.05 --epochs 200 --tag train_scratch
```
Furthermore, we provide trained checkpoints for each model for reproducibility.
Follow the command below to run inference from a trained checkpoint.

```
python main.py --train False --model <model_type> --ckpt_path <checkpoint file path> --tag inference
```
For example, if you download the VASNet checkpoint and place it inside the `dataset` folder, you can use the command as follows.

```
python main.py --train False --model VASNet --ckpt_path dataset/vasnet1_best_f1.pkl --tag vasnet_inference
```
We provide sample code for training and evaluating summarization models on Mr.HiSum.
Summarization model developers can test their own models by implementing PyTorch models under the `networks` folder.
We provide the `SimpleMLP` summarization model as a toy example.
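For orientation, a minimal frame-scoring model in the spirit of `SimpleMLP` might look like the sketch below. The class name, the 1024-dimensional input (matching YouTube-8M frame features), and the interface are assumptions; check the `networks` folder for the real example:

```python
import torch
import torch.nn as nn

class MyFrameScorer(nn.Module):
    """Toy per-frame importance scorer (hypothetical example)."""

    def __init__(self, input_dim=1024, hidden_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),  # importance score in [0, 1], like gtscore
        )

    def forward(self, features):
        # features: (batch, n_frames, input_dim) -> scores: (batch, n_frames)
        return self.mlp(features).squeeze(-1)
```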
You can train your model on the Mr.HiSum dataset using the command below. Modify or add new configurations to your taste!

```
python main.py --train True --batch_size 8 --epochs 50 --tag exp1
```
This dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, following the YouTube-8M dataset. All Mr.HiSum dataset users must comply with the YouTube Terms of Service and the YouTube API Services Terms of Service.
This code builds on PGL-SUM, VASNet, and SL-module. Every part of the code taken from the original repositories follows the corresponding license. The license for our code can be found in LICENSE.
If you find this work useful in your research, please consider citing our paper:
```
@article{sul2024mr,
  title={Mr. HiSum: a large-scale dataset for video highlight detection and summarization},
  author={Sul, Jinhwan and Han, Jihoon and Lee, Joonseok},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}
```