KichangKim / DeepDanbooru

AI based multi-label girl image classification system, implemented by using TensorFlow.
MIT License

Best learning rate #89

Closed destlaver closed 1 year ago

destlaver commented 1 year ago

So I ran some experiments changing the lr and found something strange. Training seems to improve when the lr is as low as 1e-7. The loss decreases steadily and the sample images are more consistent. Is this normal? Am I doing something wrong?

setup:
gpu: RTX 4090, 48 vCPU, 124 GB RAM
batch size: 10
lr_scheduler: cosine
lr: 1.2e-7
dataset size: 5k
captions type: tags
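
For reference, here is a rough sketch of what a plain cosine schedule does to the lr with the numbers above. It is only an illustration, not the trainer's actual scheduler: the function name `cosine_lr`, the `min_lr` floor, the absence of warmup or restarts, and the epoch count are all assumptions.

```python
import math

def cosine_lr(step, total_steps, base_lr=1.2e-7, min_lr=0.0):
    """Cosine-annealed learning rate at `step` (assumed: no warmup, no restarts)."""
    # Clamp progress to [0, 1] so steps past the end stay at min_lr.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# 5k samples at batch size 10 gives 500 optimizer steps per epoch.
steps_per_epoch = 5000 // 10
total_steps = steps_per_epoch * 10  # 10 epochs is an assumed value, not from the setup

# Print the lr at the start, middle, and end of training.
for step in (0, total_steps // 2, total_steps):
    print(step, cosine_lr(step, total_steps))
```

With this kind of schedule the lr starts at the configured value and decays toward the floor, so the average lr over the run is roughly half of the already small 1.2e-7.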

Also, I'm about to train on an A100 with a 50k training set, and I'm not sure what batch size to use or how to tune the hyperparameters to take full advantage of the GPU.
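
One common starting point when changing the batch size is the linear scaling heuristic: scale the lr by the same factor as the batch size, then check the result with a short run. The sketch below applies it to the numbers in this post. The helper names and the candidate batch sizes are hypothetical, and the heuristic is a rule of thumb, not a guarantee that the scaled lr is the best one.

```python
import math

def scale_lr(base_lr, base_batch, new_batch):
    """Linear scaling heuristic: the lr grows in proportion to the batch size."""
    return base_lr * new_batch / base_batch

def steps_per_epoch(dataset_size, batch_size):
    """Optimizer steps needed to see every sample once (last batch may be partial)."""
    return math.ceil(dataset_size / batch_size)

base_lr, base_batch = 1.2e-7, 10   # values from the RTX 4090 run above
dataset_size = 50_000              # planned A100 training set

# Candidate batch sizes are assumptions; what actually fits depends on the model and memory.
for batch in (16, 32, 64):
    lr = scale_lr(base_lr, base_batch, batch)
    print(f"batch={batch:3d}  lr={lr:.2e}  steps/epoch={steps_per_epoch(dataset_size, batch)}")
```

A practical way to use this is to raise the batch size until GPU memory is nearly full, take the scaled lr as an initial guess, and compare the loss curves against the batch size 10 baseline.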