An annotated dataset of YouTube videos designed as a benchmark for Fine-grained Incident Video Retrieval. The dataset comprises 225,960 videos associated with 4,687 Wikipedia events and 100 selected video queries.
Project Website: [link]
Paper: [publisher] [arXiv] [pdf]
git clone https://github.com/MKLab-ITI/FIVR-200K
cd FIVR-200K
pip install -r requirements.txt
or
conda install --file requirements.txt
The files that contain the dataset can be found in the dataset folder.
The video annotations are in the file annotation.json, which has the following format:
{
"5MBA_7vDhII": {
"ND": [
"_0uCw0B2AgM",
...],
"DS": [
"hc0XIE1aY0U",
...],
"CS": [
"ydEqiuDiuyc",
...],
"IS": [
"d_ZNjE7B4Wo",
...],
"DA": [
"rLvVYdtc73Q",
...],
},
....
}
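For instance, the annotations can be inspected with Python's standard json module. A minimal sketch, assuming it is run from the repository root and using the example query ID shown above:

import json

# Load the video annotations from the dataset folder
with open('dataset/annotation.json') as f:
    annotation = json.load(f)

# Count the annotated videos per label for the example query above
query_id = '5MBA_7vDhII'
for label, video_ids in annotation[query_id].items():
    print(label, len(video_ids))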
The events crawled from Wikipedia's Current Events page are in the file events.json, which has the following format:
[
{
"headline": "iraqi insurgency",
"topic": "armed conflict and attack",
"date": "2013-01-22",
"text": [
"car bombings in baghdad kill at least 17 people and injure dozens of others."
],
"href": [
"http://www.bbc.co.uk/news/world-middle-east-21141242",
"https://www.reuters.com/article/2013/01/22/us-iraq-violence-idUSBRE90L0BQ20130122"
],
"youtube": [
"ZpjqUq-EnbQ",
...
]
},
...
]
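Similarly, a minimal sketch for loading the events, again assuming it is run from the repository root; the per-topic count is only an illustrative use of the fields shown above:

import json
from collections import Counter

# Load the crawled Wikipedia events from the dataset folder
with open('dataset/events.json') as f:
    events = json.load(f)

# Count the collected YouTube videos per event topic
videos_per_topic = Counter()
for event in events:
    videos_per_topic[event['topic']] += len(event['youtube'])
print(videos_per_topic.most_common(5))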
The YouTube IDs of the videos in the dataset are in the file youtube_ids.txt
The global features of the approaches benchmarked in the paper can be found here
Run the following command to download videos:
python download_dataset.py --video_dir VIDEO_DIR [--dataset_ids DATASET_FILE] [--cores NUMBER_OF_CORES] [--resolution RESOLUTION]
An example to run the download script:
python download_dataset.py --video_dir ./videos --dataset_ids dataset/youtube_ids.txt --cores 4 --resolution 360
Videos will be saved with the following directory structure: VIDEO_DIR/YT_ID.mp4
Videos that are no longer available are listed in a text file named missing_videos.txt
Generation of the result file
The result file contains a dictionary whose keys are the YouTube IDs of the query videos; each value is another dictionary that maps the YouTube IDs of the dataset videos to their similarity to the query.
Results can be stored in a JSON file with the following format:
{
"wrC_Uqk3juY": {
"KQh6RCW_nAo": 0.716,
"0q82oQa3upE": 0.300,
...},
"k_NT43aJ_Jw": {
"-KuR8y1gjJQ": 1.0,
"Xb19O5Iur44": 0.417,
...},
....
}
An implementation that generates this JSON file can be found here
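A minimal sketch of how such a file could be produced; the similarity function is a hypothetical placeholder for your own method, and results/my_results.json is an arbitrary output path:

import json
import os
import random

# Hypothetical placeholder: replace with your method's similarity scores
def similarity(query_id, video_id):
    return random.random()

# Dataset video IDs, one YouTube ID per line
with open('dataset/youtube_ids.txt') as f:
    dataset_ids = [line.strip() for line in f if line.strip()]

# Example query IDs taken from the snippet above
query_ids = ['wrC_Uqk3juY', 'k_NT43aJ_Jw']

# Build the nested query -> video -> similarity dictionary
results = {q: {v: similarity(q, v) for v in dataset_ids} for q in query_ids}

os.makedirs('results', exist_ok=True)
with open('results/my_results.json', 'w') as f:
    json.dump(results, f)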
Evaluation of the results
Run the following command to perform the evaluation:
python evaluation.py --result_file RESULT_FILE --relevant_labels RELEVANT_LABELS
An example to run the evaluation script:
python evaluation.py --result_file ./results/lbow_vgg.json --relevant_labels ND,DS
Add the --help flag to display a detailed description of the evaluation script's arguments.
Evaluation on the three retrieval tasks
Provide the corresponding value to the relevant_labels argument to evaluate your results on each of the three visual-based retrieval tasks (example commands are given after this list):
DSVR: ND,DS
CSVR: ND,DS,CS
ISVR: ND,DS,CS,IS
For audio-based retrieval, add DA to the relevant_labels argument.
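For example, the result file from the example above can be evaluated on each visual-based task as follows:
python evaluation.py --result_file ./results/lbow_vgg.json --relevant_labels ND,DS
python evaluation.py --result_file ./results/lbow_vgg.json --relevant_labels ND,DS,CS
python evaluation.py --result_file ./results/lbow_vgg.json --relevant_labels ND,DS,CS,IS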
Reported results
To reproduce the results of the paper, run the following command:
bash evaluate_run.sh APPROACH_NAME FEATURES_NAME
An example to run the evaluation script:
bash evaluate_run.sh BOW VGG
The results may not be identical to those reported in the paper, because we are constantly fixing mislabeled videos that were missed during the annotation process.
See the Updates section for when the dataset's annotations were last updated.
If you find a mislabeled video, please submit it via the form here
The DA label provides audio-based annotations of duplicate audio videos.
If you use the FIVR-200K dataset for your research, please consider citing our paper:
@article{kordopatis2019fivr,
title={{FIVR}: Fine-grained Incident Video Retrieval},
author={Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Patras, Ioannis and Kompatsiaris, Ioannis},
journal={IEEE Transactions on Multimedia},
year={2019}
}
If you use the audio-based annotations, please also consider citing our paper:
@inproceedings{avgoustinakis2020ausil,
title={Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning},
author={Avgoustinakis, Pavlos and Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Symeonidis, Andreas L and Kompatsiaris, Ioannis},
booktitle={Proceedings of the IEEE International Conference on Pattern Recognition},
year={2020}
}
Intermediate-CNN-Features - this repo was used to extract our CNN features
NDVR-DML - one of the methods benchmarked in the FIVR-200K dataset
ViSiL - video similarity learning for fine-grained similarity calculation
AuSiL - audio similarity learning for audio-based similarity calculation
This project is licensed under the Apache License 2.0 - see the LICENSE file for details
Giorgos Kordopatis-Zilos (georgekordopatis@iti.gr)
Symeon Papadopoulos (papadop@iti.gr)