cvlab-stonybrook / Scanpath_Prediction

Predicting Goal-directed Human Attention Using Inverse Reinforcement Learning (CVPR2020)

Help with Sequence Score #24

Closed quangdaist01 closed 2 years ago

quangdaist01 commented 2 years ago

Hello, I am interested in your work, and I want to replicate the reported results first before performing further experiments (for a class project). metrics.py contains functions to compute sequence scores, but as mentioned in #3, some clustering work must be done first. I have read the Sequence Score algorithm, but I am not sure how to implement it. Could you provide some more material on computing the metric? Thank you for reading!

ouyangzhibo commented 2 years ago
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

def scanpath2clusters(meanshift, scanpath):
    # Convert one scanpath (dict with 'X'/'Y' fixation lists) into a string of
    # MeanShift cluster labels, one label per fixation.
    string = []
    xs = scanpath['X']
    ys = scanpath['Y']
    for i in range(len(xs)):
        symbol = meanshift.predict([[xs[i], ys[i]]])[0]
        string.append(symbol)
    return string

def improved_rate(meanshift, scanpaths):
    # Bandwidth-selection criterion: count transitions between different clusters
    # (Nb) and within the same cluster (Nw) along each scanpath, normalized by
    # the number of clusters (Nc).
    Nc = len(meanshift.cluster_centers_)
    Nb, Nw = 0, 0
    for scanpath in scanpaths:
        string = scanpath2clusters(meanshift, scanpath)
        for i in range(len(string)-1):
            if string[i]==string[i+1]:
                Nw += 1
            else:
                Nb += 1
    return (Nb-Nw)/Nc

# 'scanpaths' is the list of ground-truth scanpaths (one per subject) on a single image.
xs, ys = [], []
for scanpath in scanpaths:
    xs += list(scanpath['X'])
    ys += list(scanpath['Y'])

# Stack all fixations into an (N, 2) array and estimate an initial bandwidth.
gt_gaze = np.concatenate((np.vstack(xs), np.vstack(ys)), axis=1)
bandwidth = estimate_bandwidth(gt_gaze)
# Try several scaled bandwidths and keep the one with the highest improved rate.
rates = []
factors = [0.25, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
for factor in factors:
    bd = bandwidth*factor
    ms = MeanShift(bandwidth=bd)
    ms.fit(gt_gaze)
    rate = improved_rate(ms, scanpaths)
    rates.append(rate)
rates = np.vstack(rates)

# Refit MeanShift with the best-scoring bandwidth.
best_bd = factors[np.argmax(rates)]*bandwidth
best_ms = MeanShift(bandwidth=best_bd)
best_ms.fit(gt_gaze)

# save best_ms for evaluation
# Convert every ground-truth scanpath into its cluster string.
gt_strings = []
for gt_scanpath in scanpaths:
    gt_string = scanpath2clusters(best_ms, gt_scanpath)
    gt_strings.append(gt_string)

Sequence score with interaction rate: https://www.cv-foundation.org/openaccess/content_iccv_2013/papers/Borji_Analysis_of_Scores_2013_ICCV_paper.pdf
Sequence score with improved interaction rate: https://www-users.cs.umn.edu/~qzhao/publications/pdf/jiang_tnnls16.pdf

In practice, I use the bandwidth b_estimated estimated by sklearn (can be found in the example), then I try b = b_estimated*scale_i, scale_i = 0.2, 0.5, 0.8, 1.0, 1.2, 1.5, 1.8, and select the one with the highest improved interaction rate. Check out the example: https://scikit-learn.org/stable/auto_examples/cluster/plot_mean_shift.html#sphx-glr-auto-examples-cluster-plot-mean-shift-py

Input: sequences of fixations on one image

  1. You need the following functions (a sketch of the string comparison used in step d follows this list):
     a. all_fixations = scanpaths2fixations(sequences of fixations): just expand all fixations.
     b. clusters = meanshift(all_fixations, bandwidth): can be found in the example.
     c. strings = scanpaths2strings(sequences of fixations, clusters): can be found in the example.
     d. score = strings2score(strings): you already know how to do this for one subject, just do it for all subjects.
  2. For each b in b_estimated*scales: run functions b, c, and d to get a score, and select the b with the highest score.
  3. Save the clusters with the selected bandwidth b* and the ground-truth strings so you can evaluate a new scanpath easily using function c and string comparison algorithms.
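
The string comparison in step d is a pairwise similarity between cluster strings; the official implementation is in metrics.py of this repo. Below is only a minimal sketch, using a Needleman-Wunsch-style global alignment normalized by the longer string, with assumed match/mismatch/gap scores that may differ from the ones used in metrics.py:

import numpy as np

def nw_similarity(s1, s2, match=1, mismatch=-1, gap=-1):
    # Global alignment (Needleman-Wunsch) score between two cluster strings,
    # normalized by the length of the longer string.
    n, m = len(s1), len(s2)
    D = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        D[i, 0] = D[i - 1, 0] + gap
    for j in range(1, m + 1):
        D[0, j] = D[0, j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            D[i, j] = max(D[i - 1, j - 1] + sub,  # substitution
                          D[i - 1, j] + gap,      # gap in s2
                          D[i, j - 1] + gap)      # gap in s1
    return D[n, m] / max(n, m)
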
quangdaist01 commented 2 years ago

Thank you very much! Have a great day!

StoyanVenDimitrov commented 2 years ago

Hi, I tried to verify the human oracle sequence score you report in the paper (0.490), but got a much higher score of 0.678. I was able to reproduce your MultiMatch score, so the problem should not be in the data I use. I do the following:


def compute_clusters(gt_scanpaths):
    xs, ys = [], []
    for scanpath in gt_scanpaths:
        xs += list(scanpath['X'])
        ys += list(scanpath['Y'])

    gt_gaze = np.concatenate((np.vstack(xs), np.vstack(ys)), axis=1)
    bandwidth = estimate_bandwidth(gt_gaze)
    rates = []
    factors = [0.2, 0.5, 0.8, 1.0, 1.2, 1.5, 1.8] #[0.25, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
    for factor in factors:
        # bandwidth=None lets sklearn's MeanShift estimate its own bandwidth.
        bd = bandwidth*factor if bandwidth > 0.0 else None
        ms = MeanShift(bandwidth=bd)
        ms.fit(gt_gaze)
        rate = improved_rate(ms, gt_scanpaths)
        rates.append(rate)
    rates = np.vstack(rates)

    best_bd = factors[np.argmax(rates)]*bandwidth if bandwidth > 0.0 else None
    best_ms = MeanShift(bandwidth=best_bd)
    best_ms.fit(gt_gaze)

    gt_strings = []
    subjects = []
    for gt_scanpath in gt_scanpaths:
        gt_string = scanpath2clusters(best_ms, gt_scanpath)
        gt_strings.append(gt_string)
        subjects.append(gt_scanpath['subject'])

    return best_ms, gt_strings, subjects

Could it be that you changed something after publishing the paper, so that 0.490 is not the score you get with the currently provided code? Or am I doing something wrong with the clusters? Thank you!

ouyangzhibo commented 2 years ago

Is it because of the clusters you computed for the sequence score? Can you verify by using the provided clusters?

StoyanVenDimitrov commented 1 year ago

Where can I find them? I don't see any clusters.npy as mentioned here, even in older commits.

ouyangzhibo commented 1 year ago

> Where can I find them? I don't see any clusters.npy as mentioned here, even in older commits.

Please find it at https://drive.google.com/file/d/1_NDKSb2JbqbDkL3RHh24MOhrroBjkIyK/view?usp=sharing. Note that it also contains target-absent fixation clusters.
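
If you load the file with NumPy you will need allow_pickle. A quick way to inspect its layout (assuming the file is a Python dict saved with np.save, which is not spelled out here, so check the keys after loading):

import numpy as np

# Assumed: a pickled dict mapping image/task identifiers to fitted clusters.
clusters = np.load('clusters.npy', allow_pickle=True).item()
print(len(clusters))
for key in list(clusters)[:5]:   # peek at a few entries to see the structure
    print(key, clusters[key])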

StoyanVenDimitrov commented 1 year ago

Thank you. I found two things. First, I only got the score of 0.490 for the human oracle when I did not skip the evaluation of a trajectory against itself, i.e., against the same subject, which I guess should not happen. Second, the clusters are really different from mine, and I don't understand why they look like this. E.g., for 'test-present-bottle-000000547875' I get the string [0, 3, 1, 1] for the subject-2 scanpath, while your string is [2, 13, 5, 0, 3, 3, 0, 0, 1]; but the scanpath for 000000547875.jpg, subject 2, in the COCO-Search18 test data is "X": [834.2, 817.3, 1181.0, 1329.5], "Y": [531.0, 180.6, 160.8, 264.4]. The duration list T, however, has 9 elements. How do you then get a string of length 9?

ouyangzhibo commented 1 year ago
  1. Yes, you shouldn't compare a scanpath against itself.
  2. Thanks for bringing this up. It turned out that in the raw fixation test.json file we mistakenly removed the fixations after the target bounding box was fixated for the first time (this was done for the sake of training, though). I've updated the test.json file, so it is correct that there are 9 fixations in total for 'test-present-bottle-000000547875'.
StoyanVenDimitrov commented 1 year ago

The only way I got your reported result of 0.490 was with the old test.json file, which you indicated was corrupted, and when allowing a scanpath to be compared against itself. With the new file and your clusters I get 0.527 when allowing a scanpath to be compared against itself, which is wrong, and 0.476 if I don't. Neither of these matches the reported score. Apart from that, the clusters I compute with the above script differ from the ones you provide. The MultiMatch score using the new test.json is now only roughly the same as the one you reported: [0.92444455 0.7370559 0.89802225 0.921154].

Can you provide an evaluation script with which we can reproduce your scores and be sure we are doing everything right when using the new data? Thank you!

ouyangzhibo commented 1 year ago

The original json file is not corrupted, and we used it to compute the human consistency. We removed the fixations after the viewer first fixated on the target in order to implement a manual stopping criterion (i.e., stop searching once a fixation hits the target). For the data release, we wanted to also include the fixations after hitting the target, which might be interesting for other researchers.

I think you are doing the right thing. In the original paper, we included the cases of comparing a scanpath against itself, which is wrong, and this led to the human consistency of 0.490 in sequence score. Thank you so much for pointing that out!
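
For reference, this is roughly what the corrected per-image averaging looks like once self-comparisons are excluded; nw_similarity stands for whatever string similarity is used (e.g., the sketch earlier in this thread or the corresponding function in metrics.py):

def human_consistency(gt_strings):
    # Average each subject's string against every *other* subject's string on
    # the same image; a string is never compared against itself.
    scores = []
    for i in range(len(gt_strings)):
        for j in range(len(gt_strings)):
            if i == j:
                continue  # skip comparing a scanpath against itself
            scores.append(nw_similarity(gt_strings[i], gt_strings[j]))
    return sum(scores) / len(scores)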

StoyanVenDimitrov commented 1 year ago

Thank you, but I still cannot reproduce your clustering. E.g., for subject 2 on image 000000547875.jpg the string I compute is [3, 19, 7, 0, 10, 4, 0, 2, 1] while the provided one is [2, 13, 5, 0, 3, 3, 0, 0, 1]; my clustering apparently assigns different clusters where yours assigns the same one.

Can you verify that the script I posted above does what you did to get the clusters, including the list of factors [0.2, 0.5, 0.8, 1.0, 1.2, 1.5, 1.8]?

ouyangzhibo commented 1 year ago

> Can you verify that the script I posted above does what you did to get the clusters, including the list of factors [0.2, 0.5, 0.8, 1.0, 1.2, 1.5, 1.8]?

Yes, but please note that (a) we used the new .json file, which includes the fixations after fixating on the target, to do the clustering; (b) as you can see in the provided clusters.npy, target-absent fixations are also included when performing the clustering.

StoyanVenDimitrov commented 1 year ago

Thank you. I also use the new .json, so the string lengths are the same as in the pre-computed clusters. But there are still some small differences in the obtained strings. In addition, you also compute strings for the subjects with "fixOnTarget": false and "correct": 0, which I excluded from evaluation. Also, why should computing clusters for target-absent fixations make any difference? The clusters are computed per image, so they are independent of each other, right?