foolwood / DCFNet_pytorch

DCFNet: Discriminant Correlation Filters Network for Visual Tracking
https://arxiv.org/pdf/1704.04057.pdf
MIT License

When the network is converged #4

Closed · BestSonny closed 6 years ago

BestSonny commented 6 years ago

@foolwood Hi Qiang, thank you for sharing the code. I would like to know the typical loss value once the network has converged, and how many iterations that takes.

foolwood commented 6 years ago

@BestSonny Thanks for your attention.

The typical loss value is around 5-8. I think the network converges after 20 epochs. I keep training it longer for a fair comparison with SiamFC.

BestSonny commented 6 years ago

@foolwood By the way, regarding this code:

import numpy as np

def gaussian_shaped_labels(sigma, sz):
    # Gaussian centered at the origin of the (x, y) coordinate grid.
    x, y = np.meshgrid(np.arange(1, sz[0] + 1) - np.floor(float(sz[0]) / 2),
                       np.arange(1, sz[1] + 1) - np.floor(float(sz[1]) / 2))
    d = x ** 2 + y ** 2
    g = np.exp(-0.5 / (sigma ** 2) * d)
    # Circularly shift the map so that the peak moves to index [0, 0]
    # and the Gaussian tails wrap around into the four corners.
    g = np.roll(g, int(-np.floor(float(sz[0]) / 2.) + 1), axis=0)
    g = np.roll(g, int(-np.floor(float(sz[1]) / 2.) + 1), axis=1)
    return g.astype(np.float32)
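
For example (sigma and size here are just placeholder values), the peak of the returned label sits at index [0, 0] rather than at the centre of the map:

g = gaussian_shaped_labels(2.0, [121, 121])
print(np.unravel_index(g.argmax(), g.shape))  # peak at index (0, 0), i.e. the top-left corner
print(g[60, 60])                              # the value at the centre of the map is close to 0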

Instead of using a centered Gaussian ground-truth label, why do you intentionally shift it to the four corners?

 g = np.roll(g, int(-np.floor(float(sz[0]) / 2.) + 1), axis=0)
 g = np.roll(g, int(-np.floor(float(sz[1]) / 2.) + 1), axis=1)

Does this give better results?

foolwood commented 6 years ago

@BestSonny I think these two implementations are equivalent. I simply follow the implementation in the KCF TPAMI paper; see Figure 6 there. http://www.robots.ox.ac.uk/~joao/publications/henriques_tpami2015.pdf
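
Roughly speaking, the shifted label is just the centered one circularly rolled so that its peak lands at index [0, 0]; a minimal numpy check of this (assuming a square, odd-sized label, with values chosen only for illustration):

import numpy as np

def centered_gaussian(sigma, sz):
    # Same Gaussian as gaussian_shaped_labels, but without the final np.roll calls.
    x, y = np.meshgrid(np.arange(1, sz[0] + 1) - np.floor(float(sz[0]) / 2),
                       np.arange(1, sz[1] + 1) - np.floor(float(sz[1]) / 2))
    return np.exp(-0.5 / (sigma ** 2) * (x ** 2 + y ** 2))

sz = [121, 121]
shift = int(np.floor(sz[0] / 2.) - 1)

g_centered = centered_gaussian(2.0, sz)
g_shifted = np.roll(np.roll(g_centered, -shift, axis=0), -shift, axis=1)

print(np.unravel_index(g_shifted.argmax(), g_shifted.shape))  # peak is at index (0, 0)
# Rolling back by the same offsets recovers the centered label exactly, so the two
# labels differ only by a fixed circular shift (a linear phase in the Fourier domain).
print(np.allclose(np.roll(np.roll(g_shifted, shift, axis=0), shift, axis=1), g_centered))  # True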

BestSonny commented 6 years ago

@foolwood Thank you for the quick response. I agree that they are equivalent. A centered version may be more intuitive and would make it easier to use the subPixelPeak trick.
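
For reference, the subPixelPeak trick mentioned above is usually a quadratic (parabola) fit through the discrete maximum of the response map and its two neighbours, applied independently along each axis; a rough sketch with illustrative names, not code from this repo:

import numpy as np

def sub_pixel_peak(left, center, right):
    # Fit a parabola through three neighbouring response values and return the
    # fractional offset of its maximum relative to the middle sample.
    denom = 2.0 * center - right - left
    return 0.0 if denom == 0 else 0.5 * (right - left) / denom

def refine_peak(resp):
    # resp: 2-D response map whose discrete peak is assumed not to lie on the
    # border (the convenient case when the label is centered).
    r, c = np.unravel_index(resp.argmax(), resp.shape)
    dy = sub_pixel_peak(resp[r - 1, c], resp[r, c], resp[r + 1, c])
    dx = sub_pixel_peak(resp[r, c - 1], resp[r, c], resp[r, c + 1])
    return r + dy, c + dx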