This repository provides the PyTorch implementation of TranSalNet: Towards perceptually relevant visual saliency prediction, published in Neurocomputing.
Overview:
TranSalNet comes in two variants: TranSalNet_Res, with a ResNet-50 CNN backbone, and TranSalNet_Dense, with a DenseNet-161 CNN backbone.
Pre-trained models on the SALICON training set for both variants can be downloaded at:
It is also necessary to download ResNet-50 (for TranSalNet_Res) and DenseNet-161 (for TranSalNet_Dense) models pre-trained on ImageNet. These can be downloaded at:
Download the pre-trained models and place them in a folder named pretrained_models inside the code folder; the example code can then be run without further changes.
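After downloading, the folder might look like this (the checkpoint file names below are placeholders; keep whatever names the downloaded files have):

pretrained_models/
├── TranSalNet_Res.pth
├── TranSalNet_Dense.pth
├── (ResNet-50 ImageNet weights)
├── (DenseNet-161 ImageNet weights)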
We have prepared two Jupyter notebooks (.ipynb) for using TranSalNet:
testing.ipynb: computes the visual saliency maps of input images; the models are loaded with parameters pre-trained on the SALICON training set.
training&fine-tuning.ipynb: trains or fine-tunes the models.
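For reference, here is a minimal inference sketch of what testing.ipynb does. The module name TranSalNet_Res, the class name TranSalNet, and the checkpoint file name are assumptions; consult the notebook for the exact identifiers used in this repository.

```python
import numpy as np
import torch
from PIL import Image
from torchvision import transforms

from TranSalNet_Res import TranSalNet  # assumed module/class names

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load the SALICON-pretrained weights (file name assumed).
model = TranSalNet()
model.load_state_dict(torch.load('pretrained_models/TranSalNet_Res.pth', map_location=device))
model.to(device).eval()

# Inputs must be 384×288 (width×height); torchvision's Resize takes (height, width).
preprocess = transforms.Compose([
    transforms.Resize((288, 384)),
    transforms.ToTensor(),
])
x = preprocess(Image.open('example.jpg').convert('RGB')).unsqueeze(0).to(device)

with torch.no_grad():
    saliency = model(x)  # expected shape: (1, 1, 288, 384)

# Min-max normalize to 8-bit and save the predicted saliency map.
smap = saliency.squeeze().cpu().numpy()
smap = (255 * (smap - smap.min()) / (smap.max() - smap.min() + 1e-8)).astype(np.uint8)
Image.fromarray(smap).save('saliency_map.png')
```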
Data preparation for fine-tuning and training:
dataset/
├── train_ids.csv
├── val_ids.csv
├── train/
│   ├── train_stimuli/
│   │   ├── ......
│   ├── train_saliency/
│   │   ├── ......
│   ├── train_fixation/
│   │   ├── ......
├── val/
│   ├── val_stimuli/
│   │   ├── ......
│   ├── val_saliency/
│   │   ├── ......
│   ├── val_fixation/
│   │   ├── ......
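Below is a hedged sketch of a PyTorch Dataset matching the layout above. The CSV column layout and file extensions are assumptions; the repository's own data loading lives in training&fine-tuning.ipynb.

```python
import os

import pandas as pd
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class SaliencyDataset(Dataset):
    """Loads stimulus/saliency/fixation triplets from the dataset/ tree."""

    def __init__(self, root='dataset', split='train'):
        # train_ids.csv / val_ids.csv are assumed to list one file name per row.
        self.ids = pd.read_csv(os.path.join(root, f'{split}_ids.csv'))
        self.stim_dir = os.path.join(root, split, f'{split}_stimuli')
        self.sal_dir = os.path.join(root, split, f'{split}_saliency')
        self.fix_dir = os.path.join(root, split, f'{split}_fixation')
        self.to_tensor = transforms.ToTensor()

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, i):
        name = str(self.ids.iloc[i, 0])  # assumes the first column holds file names
        # All inputs are resized to 384×288 (width×height), as the model requires.
        img = Image.open(os.path.join(self.stim_dir, name)).convert('RGB').resize((384, 288))
        sal = Image.open(os.path.join(self.sal_dir, name)).convert('L').resize((384, 288))
        fix = Image.open(os.path.join(self.fix_dir, name)).convert('L').resize((384, 288))
        return self.to_tensor(img), self.to_tensor(sal), self.to_tensor(fix)
```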
In both notebooks you can choose between TranSalNet_Res and TranSalNet_Dense, depending on your needs and preferences.
Please note: The spatial size of inputs should be 384×288 (width×height).
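A quick illustration of the size convention, since PIL and torchvision order the dimensions differently:

```python
from PIL import Image
from torchvision import transforms

img = Image.open('input.jpg').convert('RGB')
img_pil = img.resize((384, 288))             # PIL: (width, height)
img_tv = transforms.Resize((288, 384))(img)  # torchvision: (height, width)
# Both produce a 384×288 (width×height) image, as required.
```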
If this work is helpful, please consider citing:
@article{TranSalNet,
title = {TranSalNet: Towards perceptually relevant visual saliency prediction},
journal = {Neurocomputing},
year = {2022},
issn = {0925-2312},
doi = {10.1016/j.neucom.2022.04.080},
author = {Jianxun Lou and Hanhe Lin and David Marshall and Dietmar Saupe and Hantao Liu},
}