HobbitLong / SupContrast

PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)
BSD 2-Clause "Simplified" License

Loss does not converge when the batch size is smaller? #127

Open kyre-99 opened 1 year ago

kyre-99 commented 1 year ago

Hello, and thank you for your work. I have a few questions about training. I only have one GPU (a 2080 Ti with 8 GB of memory), so I cannot run the parameters you provide and had to reduce the batch size. But after running SupCon for 100 epochs on CIFAR-10, the loss shows no sign of converging, and the linear-evaluation accuracy is only ~30%. I would like to know why. A simple AlexNet trained with cross-entropy loss reaches 80% accuracy within a few epochs. Is my learning rate set wrong? I haven't changed it from the 0.5 you provide. If so, what should I set it to when the batch size is smaller? Or are 100 epochs simply not enough? Thank you for your answer!
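Not advice from the maintainers, just a common heuristic that may be worth trying: rescale the learning rate linearly with the batch size (the "linear scaling rule"). A minimal sketch, assuming the 0.5 learning rate was tuned for a batch size of 1024 as paired in the README's example command; treat `base_batch_size` as an assumption and change it if your reference configuration differs:

```python
# Hedged sketch, not an official recommendation: the "linear scaling rule"
# rescales the learning rate in proportion to the batch size.
# Assumption: base_lr=0.5 was tuned for base_batch_size=1024; adjust if your
# reference configuration differs.

def scaled_lr(batch_size, base_lr=0.5, base_batch_size=1024):
    """Linearly rescale the learning rate for a different batch size."""
    return base_lr * batch_size / base_batch_size

print(scaled_lr(128))  # 0.0625 -> e.g. pass as --learning_rate 0.0625
```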

gonzaq94 commented 1 year ago

Hello! I'm facing the same issue. Did you manage to solve it? Thanks! Gonzalo

ZifengLiu98 commented 1 year ago


Hi! Have you solved it? I am working on this project and ran into the same issue. Could you please give me some advice? I have one GPU, a 3090 with 24 GB of memory.

gonzaq94 commented 1 year ago

Still struggling to make it converge with a smaller batch size...
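One thing that may help when debugging this: every positive and negative pair is drawn from the current mini-batch, so a smaller batch directly means fewer contrastive pairs per update and a noisier loss. A minimal usage sketch of the loss, adapted from the repo README's `SupConLoss` interface (the dummy tensors below are only for illustration):

```python
import torch
import torch.nn.functional as F
from losses import SupConLoss  # provided by this repo

criterion = SupConLoss(temperature=0.07)

# Dummy embeddings just for illustration: [batch_size, n_views, feat_dim]
bsz, n_views, feat_dim = 32, 2, 128
features = F.normalize(torch.randn(bsz, n_views, feat_dim), dim=-1)
labels = torch.randint(0, 10, (bsz,))

loss_supcon = criterion(features, labels)  # SupCon: positives = same-label samples in the batch
loss_simclr = criterion(features)          # SimCLR mode: positives = the other view of the same image

# Every positive/negative comes from these bsz * n_views embeddings, so halving
# the batch size halves what each anchor is contrasted against.
```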

ZifengLiu98 commented 1 year ago

> Hi! Have you solved it? I am working on this project and ran into the same issue. Could you please give me some advice? I have one GPU, a 3090 with 24 GB of memory.
>
> Still struggling to make it converge with a smaller batch size...

Maybe the method in this repo can help: https://github.com/Junya-Chen/FlatCLR#flatnce-a-novel-contrastive-representation-learning-objective. I'll give it a try.