So I ran some experiments varying the lr and noticed something strange: training seems to improve when the lr is as low as 1e-7. The loss decreases steadily and the sample images are more consistent. Is that normal, or am I doing something wrong?
Setup:
- GPU: RTX 4090, 48 vCPU, 124 GB RAM
- Batch size: 10
- lr_scheduler: cosine
- lr: 1.2e-7
- Dataset size: 5k
- Captions: tags
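For reference, here is a minimal sketch of what the optimizer/scheduler side of that setup would look like in plain PyTorch; the model and step count are placeholders, not my actual script:

```python
# Minimal sketch of the optimizer/scheduler part of the setup above, in plain
# PyTorch. `model` is a stand-in for whatever is actually being finetuned, and
# num_training_steps is a placeholder; the real run goes through a full
# training script.
import torch

learning_rate = 1.2e-7        # the lr that, surprisingly, looks best so far
num_training_steps = 10_000   # placeholder: roughly epochs * (dataset_size / batch_size)

model = torch.nn.Linear(8, 8)  # stand-in for the real network
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)

# Cosine decay over the whole run; the scheduler is stepped once per optimizer step.
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=num_training_steps
)
```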
Also, I'm about to train on an A100 with a 50k training set, and I'm not sure what batch size to pick or how to tune the hyperparameters to take full advantage of the GPU.
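To make the question more concrete, this is the back-of-the-envelope reasoning I'd apply when scaling up, using the common "scale the lr linearly with batch size" heuristic; the target and micro-batch sizes are assumed examples, not measured limits:

```python
# Back-of-the-envelope scaling from the 4090 run (batch 10, lr 1.2e-7) to a
# bigger batch on the A100. The linear lr-scaling heuristic is a rule of thumb,
# not a guarantee, and the batch sizes below are assumed examples.
base_batch_size = 10
base_lr = 1.2e-7

target_batch_size = 40                    # assumed: whatever fits in A100 VRAM
scaled_lr = base_lr * target_batch_size / base_batch_size

# If the target batch doesn't fit in memory in one go, gradient accumulation
# reaches the same effective batch size with smaller per-step micro-batches.
micro_batch_size = 20                     # assumed per-step batch that fits
grad_accum_steps = target_batch_size // micro_batch_size

print(f"scaled lr: {scaled_lr:.2e}, grad accumulation steps: {grad_accum_steps}")
```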