Loss close to -1 at the beginning of training

PatrickHua / SimSiam

A PyTorch implementation of the paper 'Exploring Simple Siamese Representation Learning'

MIT License

Loss close to -1 at the beginning of training #6

Closed. JudasDie closed this issue 3 years ago.

JudasDie commented 3 years ago

Has anyone run into the problem that the loss is close to -1 at the beginning of training? BTW, the training data does not come from a traditional classification dataset like ImageNet or CIFAR.
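
For context, the SimSiam paper defines the loss as a symmetrized negative cosine similarity between the predictor output p and the (stop-gradient) projector output z, so -1 is its lowest possible value. The following is a minimal, framework-free sketch with made-up vectors, not code from this repository; it only illustrates that outputs which all point in the same direction reach -1 exactly, which is why a loss near -1 right at the start may indicate collapsed outputs rather than good representations.

```python
import math

def neg_cosine(p, z):
    """Negative cosine similarity D(p, z) = -(p . z) / (|p| |z|)."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)

def simsiam_loss(p1, z1, p2, z2):
    """Symmetrized loss: D(p1, z2) / 2 + D(p2, z1) / 2.

    The stop-gradient on z only affects backpropagation, so it is
    omitted in this forward-only sketch.
    """
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)

# Hypothetical collapsed network: every view maps to the same vector.
collapsed = [0.3, 0.3, 0.3, 0.3]
loss_collapsed = simsiam_loss(collapsed, collapsed, collapsed, collapsed)

# Hypothetical non-collapsed outputs for two views of one image.
p1, z1 = [1.0, 0.0, -0.5, 0.2], [0.2, 0.9, 0.1, -0.4]
p2, z2 = [0.1, 1.0, 0.3, -0.2], [0.7, -0.1, 0.4, 0.5]
loss_normal = simsiam_loss(p1, z1, p2, z2)

print(f"collapsed: {loss_collapsed:.4f}, non-collapsed: {loss_normal:.4f}")
```

The collapsed case evaluates to approximately -1.0, while the second case stays clearly above it.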

PatrickHua commented 3 years ago

What kind of data? CIFAR/ImageNet images put the target object in the middle, and typically only one instance appears, which suits contrastive learning. The data augmentation (random resized crop etc.) is tailored to this type of input.

JudasDie commented 3 years ago

It is similar to the COCO format, but I crop the target from the original image, so the target is centered in the image, as in ImageNet.

matthiasware commented 3 years ago

> Has anyone run into the problem that the loss is close to -1 at the beginning of training? BTW, the training data does not come from a traditional classification dataset like ImageNet or CIFAR.

Yes! Especially with parameters that differ from the ones given in the paper! Check out Fig. 2 in the paper and track the channel std!

JudasDie commented 3 years ago

Hi, do you use ImageNet or custom data?

matthiasware commented 3 years ago

A custom dataset! But I can get good results by choosing the right parameters!

JudasDie commented 3 years ago

Could you share some of your experience? E.g., how do you choose the parameters, or which parameter matters most?

PatrickHua commented 3 years ago

> Yes! Especially with parameters that differ from the ones given in the paper!

Interesting! So SimSiam is very sensitive to certain parameters, right? Which hyperparameters did you change?

> Check out Fig. 2 in the paper and track the channel std!

What's with the channel std?
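
On the channel std: Fig. 2 of the paper plots the per-channel standard deviation of the l2-normalized output, averaged over all channels. The paper notes that for outputs following a zero-mean isotropic Gaussian this value is about 1/sqrt(d), whereas collapsed outputs drive it toward zero. Below is a minimal, framework-free sketch of that diagnostic on synthetic data. The function names, batch size, and dimension are illustrative assumptions, not part of this repository; in an actual training loop the same statistic would be computed on the model's output tensor.

```python
import math
import random
import statistics

def l2_normalize(vec):
    """Scale a vector to unit l2 norm (an all-zero vector is left unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def mean_channel_std(batch):
    """Std over the batch of each channel of z / |z|, averaged over channels.

    `batch` is a list of output vectors z, one per sample.
    """
    normalized = [l2_normalize(z) for z in batch]
    channels = zip(*normalized)  # transpose: one tuple per channel
    return statistics.mean(statistics.pstdev(ch) for ch in channels)

random.seed(0)
d, n = 64, 512  # assumed output dimension and batch size

# Synthetic "healthy" outputs: zero-mean isotropic Gaussian.
healthy = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]

# Synthetic "collapsed" outputs: nearly the same vector for every sample.
collapsed = [[1.0 + random.gauss(0, 1e-3) for _ in range(d)] for _ in range(n)]

std_healthy = mean_channel_std(healthy)
std_collapsed = mean_channel_std(collapsed)

print(f"reference 1/sqrt(d): {1 / math.sqrt(d):.4f}")
print(f"healthy:   {std_healthy:.4f}")
print(f"collapsed: {std_collapsed:.6f}")
```

If the value logged during training falls toward zero while the loss sits near -1, that combination suggests the outputs have collapsed.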