mever-team / distill-and-select

Authors' official PyTorch implementation of "DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval" [IJCV 2022]
Apache License 2.0

DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval

This repository contains the PyTorch implementation of the paper DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval. It provides code for the knowledge distillation training of coarse- and fine-grained student networks based on similarities calculated from a teacher and the selector network. Also, the scripts for the training of the selector network are included. Finally, to facilitate the reproduction of the paper's results, the evaluation code, the extracted features for the employed video datasets, and pre-trained networks for the various students and selectors are available.

Prerequisites

Preparation

Installation

Feature files

Distillation

We provide the code for training and evaluation of our student models.

Student training

For fine-grained attention students:

python train_student.py --student_type fine-grained --binarization false --attention true --experiment_path /path/to/experiment/ --trainset_hdf5 /path/to/dns_100k.hdf5

For fine-grained binarization students:

python train_student.py --student_type fine-grained --binarization true --attention false --experiment_path /path/to/experiment/ --trainset_hdf5 /path/to/dns_100k.hdf5
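The two commands above differ only in the `--binarization` and `--attention` values, so when launching several runs it can help to assemble them programmatically. The following is a minimal sketch that uses only the flags shown above; the helper name `build_train_student_command` is hypothetical and not part of the repository.

```python
import shlex


def build_train_student_command(student_type, binarization, attention,
                                experiment_path, trainset_hdf5):
    """Assemble a train_student.py call (hypothetical helper, not in the repo).

    Only the flags that appear in the commands above are used.
    """
    args = [
        "python", "train_student.py",
        "--student_type", student_type,
        # The commands above spell booleans in lowercase ("true"/"false").
        "--binarization", str(binarization).lower(),
        "--attention", str(attention).lower(),
        "--experiment_path", experiment_path,
        "--trainset_hdf5", trainset_hdf5,
    ]
    # shlex.join quotes arguments where needed, so paths with spaces stay intact.
    return shlex.join(args)


attention_cmd = build_train_student_command(
    "fine-grained", False, True,
    "/path/to/experiment/", "/path/to/dns_100k.hdf5",
)
binarization_cmd = build_train_student_command(
    "fine-grained", True, False,
    "/path/to/experiment/", "/path/to/dns_100k.hdf5",
)
print(attention_cmd)
print(binarization_cmd)
```

The resulting strings can be pasted into a shell or passed to a job scheduler.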

Selection

We also provide the code for training the selector network and for evaluating the overall DnS framework.
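As a rough illustration of how the three networks can fit together at retrieval time, the sketch below ranks every database video by its coarse similarity, then re-scores only the fraction of videos with the highest selector scores using a fine-grained similarity function. It is a simplified, pure-Python sketch and not the repository's evaluation code: the function `select_and_rerank`, its arguments, the `fraction` parameter, and the toy numbers are all hypothetical, and it assumes that a higher selector score marks a video as more worth re-scoring.

```python
def select_and_rerank(coarse_sims, selector_scores, fine_similarity, fraction):
    """Re-score a fraction of the database with a fine-grained similarity.

    coarse_sims:     dict of video id to coarse-grained similarity
    selector_scores: dict of video id to selector score (assumed: higher = re-score)
    fine_similarity: callable mapping a video id to a fine-grained similarity
    fraction:        share of the database (0 to 1) sent to the fine-grained model
    """
    budget = int(round(fraction * len(coarse_sims)))
    # Pick the videos with the highest selector scores, up to the budget.
    selected = sorted(selector_scores, key=selector_scores.get, reverse=True)[:budget]
    # Start from the coarse similarities and overwrite the selected ones.
    final = dict(coarse_sims)
    for video_id in selected:
        final[video_id] = fine_similarity(video_id)
    # Final ranking: most similar video first.
    ranking = sorted(final, key=final.get, reverse=True)
    return final, ranking, selected


# Toy values, made up purely for illustration.
coarse = {"a": 0.9, "b": 0.5, "c": 0.4, "d": 0.1}
selector = {"a": 0.1, "b": 0.8, "c": 0.7, "d": 0.2}
fine = {"a": 0.95, "b": 0.3, "c": 0.85, "d": 0.05}

final, ranking, selected = select_and_rerank(coarse, selector, fine.get, fraction=0.5)
print(selected)
print(ranking)
```

With `fraction=0.5`, only two of the four toy videos are passed to the fine-grained function; the rest keep their coarse similarity, which is where the savings in computation come from.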

Selector training

DnS Evaluation

Use our pretrained models

We also provide our pretrained models trained with the fg_att_student_iter2 teacher.

The feature extraction network used in our experiments

feature_extractor = FeatureExtractor(dims=512).eval()

Our Fine-grained Students

fg_att_student = FineGrainedStudent(pretrained=True, attention=True).eval()
fg_bin_student = FineGrainedStudent(pretrained=True, binarization=True).eval()

Our Coarse-grained Students

cg_student = CoarseGrainedStudent(pretrained=True).eval()

Our Selector Networks

selector_att = SelectorNetwork(pretrained=True, attention=True).eval()
selector_bin = SelectorNetwork(pretrained=True, binarization=True).eval()


* First, extract video features by providing a video tensor to the feature extractor (similar to [here](https://github.com/MKLab-ITI/visil/tree/pytorch#use-visil-in-your-python-code))
```python
video_features = feature_extractor(video_tensor)
```

Citation

If you use this code for your research, please consider citing our papers:

@article{kordopatis2022dns,
  title={{DnS}: {Distill-and-Select} for Efficient and Accurate Video Indexing and Retrieval},
  author={Kordopatis-Zilos, Giorgos and Tzelepis, Christos and Papadopoulos, Symeon and Kompatsiaris, Ioannis and Patras, Ioannis},
  journal={International Journal of Computer Vision},
  year={2022}
}

@inproceedings{kordopatis2019visil,
  title={{ViSiL}: Fine-grained Spatio-Temporal Video Similarity Learning},
  author={Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Patras, Ioannis and Kompatsiaris, Ioannis},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2019}
}

Related Projects

ViSiL - here you can find our teacher model

FIVR-200K - download our FIVR-200K dataset

Acknowledgements

This work has been supported by the projects WeVerify and MediaVerse, partially funded by the European Commission under contract numbers 825297 and 957252, respectively, and by DECSTER, funded by EPSRC under contract number EP/R025290/1.

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Contact for further details about the project

Giorgos Kordopatis-Zilos (georgekordopatis@iti.gr)