I used the csdarknet53-omega.cfg file to train a classification network (not detection) on my own dataset. With label_smooth_eps=0.1, the loss increases gradually no matter what learning rate I set, but without label_smooth_eps=0.1 the loss converges. Why is that?
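One likely part of the explanation: with label smoothing the target distribution is no longer one-hot, so cross-entropy has a nonzero floor and actually grows as the network becomes more confident in the true class. The sketch below (my own illustration, not darknet's source; I'm assuming the common smoothing rule target = (1-eps)·one_hot + eps/classes) shows this for 5 classes and eps=0.1:

```python
import math

def smoothed_ce(probs, true_idx, eps, k):
    """Cross-entropy against a label-smoothed target distribution."""
    # Smoothed target: spread eps/k over all classes, (1-eps) extra on the true one.
    target = [eps / k] * k
    target[true_idx] += 1.0 - eps
    return -sum(t * math.log(max(p, 1e-12)) for t, p in zip(target, probs))

K = 5
# A nearly fully confident prediction on the true class.
confident = [1e-12] * K
confident[0] = 1.0 - 4e-12

# Hard labels (eps=0): confident predictions drive the loss toward zero.
print(smoothed_ce(confident, 0, 0.0, K))  # ~0.0

# With eps=0.1 the same confident prediction is penalized heavily,
# so as training sharpens the softmax, the reported loss can rise.
print(smoothed_ce(confident, 0, 0.1, K))  # > 2

# The minimum is reached when probs equal the smoothed target itself,
# and even that minimum is nonzero (~0.39 here), not 0.
matched = [0.02] * K
matched[0] = 0.92
print(smoothed_ce(matched, 0, 0.1, K))    # ~0.39
```

So a rising loss under smoothing does not necessarily mean the model is getting worse; checking validation accuracy alongside the loss is a better signal.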
By the way, I have five categories with roughly 1,800 images per category. How many training iterations (max_batches) and what batch size should I set?
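For sizing, it may help to convert max_batches into epochs over the dataset. The numbers below are an assumed starting point (batch=64, max_batches=10000 are hypothetical values, not a recommendation from the repo), just to show the arithmetic:

```python
# Relate darknet's max_batches to full passes over this dataset.
num_images = 5 * 1800      # five classes, ~1800 images each
batch = 64                 # images consumed per training iteration (assumed)
max_batches = 10000        # total iterations (assumed starting point)

epochs = max_batches * batch / num_images
print(round(epochs, 1))    # ~71.1 passes over the dataset
```

From there you can tune max_batches up or down depending on how many epochs the validation accuracy needs to plateau.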