wenguanwang / DHF1K

Revisiting Video Saliency: A Large-scale Benchmark and a New Model (CVPR18, PAMI19)

How are ground truth saliency maps generated from recorded fixations? #2

Closed wjakobw closed 6 years ago

wjakobw commented 6 years ago

In your data collection you gather a set of discrete fixation maps (P in the paper). From these, continuous saliency maps (Q in the paper) are generated. I found no details about how this is done; could you elaborate? I would guess that it involves Gaussians centered on each fixation point. I am interested in the exact parameters, how you combine fixations from different test subjects, and so on.

Thanks again for providing the dataset!

wenguanwang commented 6 years ago

@wjakobw

I use the code from https://github.com/remega/video_database/blob/master/make_gauss_masks2.m to blur the fixation map.

wjakobw commented 6 years ago

Thank you. I assume this means you used the same parameters as in that code, W=30? No extra adjustments for the geometry of your experimental setup?

For future reference if others are thinking about this:

In the meantime I wanted to generate some more ground truth frames in the same format as yours, so I fitted a Gaussian to a few dots in your saliency frames to determine the parameters. Using this example for fitting, I found that the Gaussian width was 12.9 pixels. I then defined a new Gaussian function with code from the same example to blur my own fixation maps with those parameters (scaled up, because I needed higher-resolution saliency maps). The resulting saliency maps looked like the DHF1K maps.
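
For anyone who wants to reproduce that estimate without a full curve fit, here is a minimal sketch that recovers the Gaussian width of one isolated fixation blob from its second moments. The file name and crop coordinates are placeholders, not values from the dataset, and this is not the fitting code referenced above.

% Estimate the Gaussian width of an isolated blob in a saliency frame.
salmap = im2double(imread('saliency_frame.png'));   % one saliency frame (placeholder file name)
blob   = salmap(200:260, 300:360);                  % crop around a single, isolated blob (placeholder coords)
blob   = blob / sum(blob(:));                       % treat the crop as a 2-D density

[rows, cols] = ndgrid(1:size(blob, 1), 1:size(blob, 2));
mu_r  = sum(rows(:) .* blob(:));                    % centroid (row)
mu_c  = sum(cols(:) .* blob(:));                    % centroid (column)
var_r = sum(((rows(:) - mu_r).^2) .* blob(:));      % second moments about the centroid
var_c = sum(((cols(:) - mu_c).^2) .* blob(:));
sigma = sqrt((var_r + var_c) / 2);                  % isotropic Gaussian width estimate
fprintf('Estimated sigma: %.1f px\n', sigma);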

kylemin commented 6 years ago

@wenguanwang Did you use the same code to blur the fixation map of Hollywood2 or UCFSports dataset?

wenguanwang commented 6 years ago

@kylemin

For Hollywood2 and UCF, I directly use the following function to blur the fixation maps:

densityMap = imfilter(fixations, fspecial('gaussian', 150, 20), 'replicate');
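
A minimal end-to-end sketch of how that call could be used, assuming the fixations for one frame are given as pixel coordinates; the variable names, the example resolution, and the final rescaling to [0, 255] are my additions, not part of the answer above.

% Build a binary fixation map and blur it with the same 150x150, sigma = 20 Gaussian.
video_res_y = 360;  video_res_x = 640;                 % frame resolution (adjust to your videos)
gaze_x = [120, 305, 410];  gaze_y = [80, 190, 200];    % example fixation locations (columns, rows)

fixations = zeros(video_res_y, video_res_x);
fixations(sub2ind(size(fixations), gaze_y, gaze_x)) = 1;

densityMap = imfilter(fixations, fspecial('gaussian', 150, 20), 'replicate');
densityMap = uint8(255 * densityMap / max(densityMap(:)));   % rescale for saving
imwrite(densityMap, 'saliency_map.png');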

wenguanwang commented 6 years ago

@wjakobw

I use this function for blurring the fixations in DHF1K:

[x, y] = find(fixations);
densityMap = make_gauss_masks(y, x, [video_res_y, video_res_x]);

make_gauss_masks.zip
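
For readers who only want the shape of the operation, here is a minimal sketch of what a function with this interface presumably does: place a Gaussian at every fixation and combine them per pixel. The function name, the sigma value, and the max-combination are my assumptions; this is not the contents of make_gauss_masks.zip.

function densityMap = make_gauss_masks_sketch(cols, rows, res, sigma)
    % cols, rows: fixation coordinates; res = [video_res_y, video_res_x]
    if nargin < 4, sigma = 20; end                 % assumed Gaussian width in pixels
    [X, Y] = meshgrid(1:res(2), 1:res(1));         % pixel grid
    densityMap = zeros(res);
    for k = 1:numel(rows)
        g = exp(-((X - cols(k)).^2 + (Y - rows(k)).^2) / (2 * sigma^2));
        densityMap = max(densityMap, g);           % combine fixations by per-pixel maximum
    end
end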

wenguanwang commented 6 years ago

@wjakobw @kylemin

I'm zipping the Hollywood-2 and UCF data. Please give me some time; they will be uploaded later.

kylemin commented 6 years ago

Thank you, I appreciate it.

wenguanwang commented 6 years ago

@wjakobw @kylemin

Hi all, the Hollywood-2 and UCF data have been uploaded.

The code (ACLNet) and the datasets (DHF1K with raw gaze records; UCF-Sports is newly added!) can be downloaded from:

Google Drive: https://drive.google.com/open?id=1sW0tf9RQMO4RR7SyKhU8Kmbm4jwkFGpQ

Baidu pan: https://pan.baidu.com/s/110NIlwRIiEOTyqRwYdDnVg

The Hollywood-2 data (74.6 GB) can be downloaded from:

Google Drive: https://drive.google.com/open?id=1vfRKJloNSIczYEOVjB4zMK8r0k4VJuWk

wjakobw commented 6 years ago

Thank you for the processed datasets; they're very helpful. The Hollywood-2 link requires extra permission from you to download. I sent an access request via Google Drive, but perhaps you want to make it public like the other links.

As a friendly suggestion for others downloading such large files from Google Drive or similar onto a remote server: this Firefox extension was very helpful for me.

wenguanwang commented 6 years ago

@wjakobw thanks for the reminder. It is publicly accessible now.

https://drive.google.com/file/d/1vfRKJloNSIczYEOVjB4zMK8r0k4VJuWk/view?usp=sharing

kylemin commented 6 years ago

@wenguanwang Thank you!