I used 32 A100 GPUs with a batch size of 6 per GPU, a learning rate of 12e-4, and a backbone learning rate of 12e-5; all other settings are the same. How can I get the correct result? Any helpful advice is appreciated!
I suggest you first run experiments with exactly the same settings, and change them only after you have reproduced the expected results. If the results still differ with the original settings, there may be an environment-related problem to solve.
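To compare a changed setup against a reference one, it can help to write the arithmetic down. The sketch below computes the global batch size from the numbers in the question and shows what a linearly scaled learning rate would look like. The reference values (`ref_num_gpus`, `ref_per_gpu_batch`, `ref_lr`, `ref_backbone_lr`) are hypothetical placeholders, not the repository's confirmed defaults, and linear scaling is only one common heuristic that may not hold for every model or optimizer.

```python
def global_batch_size(num_gpus: int, per_gpu_batch: int) -> int:
    """Total number of samples per optimizer step across all GPUs."""
    return num_gpus * per_gpu_batch


def linearly_scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale a learning rate in proportion to the global batch size."""
    return base_lr * new_batch / base_batch


# Values reported in the question.
new_batch = global_batch_size(num_gpus=32, per_gpu_batch=6)  # 32 * 6 = 192

# HYPOTHETICAL reference settings, for illustration only.
# Replace them with the defaults actually used by the original config.
ref_num_gpus = 8
ref_per_gpu_batch = 2
ref_lr = 1e-4
ref_backbone_lr = 1e-5

ref_batch = global_batch_size(ref_num_gpus, ref_per_gpu_batch)

print("global batch size:", new_batch)
print("scaled lr:", linearly_scaled_lr(ref_lr, ref_batch, new_batch))
print("scaled backbone lr:", linearly_scaled_lr(ref_backbone_lr, ref_batch, new_batch))
```

If the scaled values match what was used and the results are still off, that points toward the large-batch setup itself (for example warmup or schedule length) or the environment, which is another reason to reproduce the reference settings first.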