Closed Samjith888 closed 4 years ago
@Samjith888
The default hyperparameters of CenterMask are tuned for the COCO dataset with a batch size of 16 on 8 GPUs.
You have to find suitable hyperparameters for your own dataset and environment.
How many GPUs and what batch size are you using?
When you change the batch size for training, you have to adjust base_lr accordingly.
I recommend adjusting the hyperparameters as follows:
* reduce base_lr
* increase WARMUP_ITERS
* increase the batch size as much as possible
* change the backbone to a lightweight model
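One common way to adjust base_lr for a different batch size is the linear scaling heuristic: scale the learning rate by the same factor as the batch size, and stretch the iteration count by the inverse factor. The sketch below is only an illustration of that heuristic, not the exact procedure used for CenterMask. The helper name `scale_solver` is hypothetical, and the reference learning rate and iteration count are placeholder assumptions; only the reference batch size of 16 comes from this thread. The scaled values are a starting point and may still need manual tuning.

```python
def scale_solver(ref_lr, ref_batch, ref_max_iter, new_batch):
    """Rescale solver settings with the linear scaling heuristic.

    The learning rate grows or shrinks in proportion to the batch size,
    and the number of iterations changes by the inverse factor so that
    roughly the same number of images is seen during training.
    """
    factor = new_batch / ref_batch
    return {
        "base_lr": ref_lr * factor,
        "max_iter": round(ref_max_iter / factor),
    }


# Placeholder reference values (assumptions): lr 0.01 and 90000 iterations.
# Only the reference batch size of 16 is taken from the thread.
settings = scale_solver(ref_lr=0.01, ref_batch=16, ref_max_iter=90000, new_batch=4)
print(settings)
```

If the loss still diverges with the scaled values, lowering base_lr further or lengthening the warmup, as suggested in the list above, would be the next things to try.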
Dataset: 52,000 images, one class, 3.5 lakh (350,000) instances. Num_GPU = 1. What batch size and base_lr should I use?
I don't know which settings are best for every environment.
But I recommend using as large a batch size as your GPUs can handle.
I got NaN values when I used the default config with VoVNet. I then tried reducing base_lr to 0.001 and 0.00025. That solved the NaN issue, but the training loss is barely decreasing (it started at 1.9 and has reached 0.7), and the AP is 11 after 75,000 iterations.
Dataset: 57,000 images with one class; the images have different resolutions.