marcellacornia / sam

Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model. IEEE Transactions on Image Processing (2018)
https://ieeexplore.ieee.org/document/8400593
MIT License
211 stars 76 forks

Loading Salicon dataset #9

Closed spandanagella closed 6 years ago

spandanagella commented 6 years ago

Hi,

I'm trying to retrain the models on SALICON using your code. I don't see the fixations and fixation maps organized as the code expects here, i.e., separate files for each image. Is there a preprocessing step I'm supposed to run before using the code?

Can you give me pointers to the dataset download URL?

I'm currently downloading the dataset from http://salicon.net/download/

Thank you so much!
Spandana

marcellacornia commented 6 years ago

Hi @spandanagella, thanks for downloading our code.

You can download the ground-truth density maps and fixation maps from this page: http://salicon.net/challenge-2017/. If you want to replicate our results, you have to use the original release of the SALICON dataset.

Mastya commented 6 years ago

Hi @marcellacornia, I downloaded your code and I'm trying to train the model on the SALICON dataset. However, I ran into a few issues:

  1. Here: https://github.com/marcellacornia/sam/blob/bba23cdc3eb921563d23a4339cd774a45b7b903c/utilities.py#L106 you reference the key "I" in the fixation map, but there is no such key in the SALICON fixation maps. There are only these keys:
    fix_map.keys()
    Out[68]: dict_keys(['__header__', '__version__', '__globals__', 'image', 'resolution', 'gaze'])
  2. I assumed that 'I' is equivalent to 'gaze'. But on the next line: https://github.com/marcellacornia/sam/blob/bba23cdc3eb921563d23a4339cd774a45b7b903c/utilities.py#L107 fix_map is passed to padding_fixation, where it is handled like an image (padding is added). However, 'gaze' is a nested ndarray of shape (54, 1), where each inner record consists of 3 arrays with different shapes (see the inspection sketch below). So my question is: what do I need to pass to padding_fixation as the fix_map? I can't find the appropriate data.
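For reference, here is a minimal inspection sketch of the new-format files (the file path is hypothetical; the keys and shapes are the ones listed above):

    import scipy.io

    # Load one new-format SALICON fixation file (path is illustrative).
    mat = scipy.io.loadmat("fixations/train/COCO_train2014_000000000009.mat")
    print(mat.keys())    # 'image', 'resolution', 'gaze' plus Matlab headers

    gazes = mat["gaze"]  # nested object array, e.g. shape (54, 1): one row per observer
    record = gazes[0][0]  # first observer; each record wraps 3 arrays
    print(record[0].shape, record[1].shape, record[2].shape)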
marcellacornia commented 6 years ago

Hi @Mastya, are you using the original release of the SALICON dataset (the one reported at the end of this page)?

The SALICON authors changed the data format a few months ago, but my code only supports the original format.
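A quick way to tell the two releases apart (a minimal sketch; the key names are the ones discussed in this thread, where the original release stores a fixation matrix under "I" and the newer release stores raw gaze data under "gaze"):

    import scipy.io

    def salicon_release(path):
        # Original release: fixation matrix under "I" (what utilities.py expects).
        # Newer release: per-observer gaze records under "gaze".
        keys = scipy.io.loadmat(path).keys()
        if "I" in keys:
            return "original"
        if "gaze" in keys:
            return "new"
        raise ValueError("unrecognized SALICON fixation file: %s" % path)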

Mastya commented 6 years ago

Yes, @marcellacornia, I checked all the datasets from this page, including the previous release of SALICON (the Matlab files and saliency maps used in the '15 and '16 challenges). The data structure is the same as I described before, and it doesn't correspond to the data processing in the code. Can you please tell me where I can get the original data that you used for training? Or how can I convert the existing data into a suitable format?

marcellacornia commented 6 years ago

Please try changing the preprocess_fixmaps function as follows:

import numpy as np
import scipy.io

def preprocess_fixmaps(paths, shape_r, shape_c):
    # One single-channel binary fixation map per input file.
    ims = np.zeros((len(paths), 1, shape_r, shape_c))

    for i, path in enumerate(paths):
        # New-format files store per-observer gaze records under "gaze".
        gazes = scipy.io.loadmat(path)["gaze"]
        coords = []
        for gaze in gazes:
            # The third array of each observer record holds the fixation points.
            coords.extend(gaze[0][2])
        for coord in coords:
            # coord is (x, y); keep only in-bounds fixations.
            if 0 <= coord[1] < shape_r and 0 <= coord[0] < shape_c:
                ims[i, 0, coord[1], coord[0]] = 1.0

    return ims
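A hedged usage sketch (the glob pattern is hypothetical, and 480x640 is just the native SALICON image resolution; substitute the shapes your config actually uses):

    import glob

    fix_paths = sorted(glob.glob("fixations/train/*.mat"))
    fix_maps = preprocess_fixmaps(fix_paths, 480, 640)
    print(fix_maps.shape)  # (num_files, 1, 480, 640)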
spandanagella commented 6 years ago

Hi @Mastya,

Were you able to train the models with the above preprocessing code?

Spandana

Mastya commented 6 years ago

Hi @spandanagella, yes, I tried it. It works, but the results are weird: on nearly every epoch I got negative losses. I tested @marcellacornia's model against the one I finally got after training, on the same test set, and the results don't even look similar. I'm continuing my research; if you have any useful ideas, let's share them.
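Update: the negative losses may not be a bug by themselves. As far as I can tell, the repo combines a KL divergence term with CC and NSS terms that are weighted negatively (since CC and NSS should be maximized), so the total can dip below zero. A minimal sketch of that kind of combination; the weight values here are assumptions, not the repo's exact ones:

    def combined_saliency_loss(kl, cc, nss, weights=(10.0, -2.0, -1.0)):
        # KL is minimized; CC and NSS are maximized, hence the negative
        # weights, which let the total loss legitimately go negative.
        w_kl, w_cc, w_nss = weights
        return w_kl * kl + w_cc * cc + w_nss * nss

    print(combined_saliency_loss(kl=0.2, cc=0.9, nss=2.5))  # -2.3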

SenJia commented 6 years ago

Hi @marcellacornia, I really like your work "Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model". May I ask whether the precomputed saliency maps from SAM-ResNet on the SALICON validation set are the same ones you used to report the results in Table IV of the paper? Many thanks, Sen

marcellacornia commented 6 years ago

Hi @SenJia, thank you.

The results in Table IV were obtained by feeding the output of the Attentive ConvLSTM at different timesteps into the rest of the model. They are computed on the SALICON validation set, using the 2015 version of the dataset.

The pre-computed saliency maps we released are the ones used for the T=4 results in Table IV.

eleboss commented 6 years ago

@prachees Me too, the loss starts out as NaN.

It looks like: loss: nan - lambda_2_loss: nan. Is this normal?
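One thing worth ruling out (a minimal sanity check, assuming the preprocess_fixmaps defined earlier in this thread; paths and shapes are illustrative): fixation maps with zero fixations make metrics like NSS, which average the normalized prediction at fixated pixels, ill-defined, and that can surface as NaN losses.

    import glob
    import numpy as np

    fix_paths = sorted(glob.glob("fixations/train/*.mat"))
    fix_maps = preprocess_fixmaps(fix_paths, 480, 640)

    empty = [p for p, m in zip(fix_paths, fix_maps) if m.sum() == 0]
    print("%d of %d fixation maps have no fixations" % (len(empty), len(fix_paths)))
    print("any NaNs in maps:", np.isnan(fix_maps).any())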