Training does not converge on a dataset with dense bounding boxes

Hi Zeng,

Thank you for your good work.

I was able to train on the MSRA-TD500 dataset and reproduce the results using your code. Wonderful work indeed!

However, when I trained on multiple GPUs with my own dataset, which has denser bounding boxes and larger images, the training did not converge.

Could you please share any ideas about what might cause this?

Thank you in advance!

The configuration is attached below:

==========Options============
means: [0.485, 0.456, 0.406]
stds: [0.229, 0.224, 0.225]
gpu: 1
max_epoch: 400
start_epoch: 0
cuda: True
output_dir: output
input_size: 1376
max_annotation: 64
adj_num: 4
num_points: 20
use_hard: True
load_memory: True
scale: 1
grad_clip: 25
pos: False
dis_threshold: 0.35
cls_threshold: 0.875
approx_factor: 0.004
know: False
exp_name: TD500HUST_mid_convert
resume: None
num_workers: 21
mgpu: True
save_dir: ./model/
vis_dir: ./vis/
log_dir: ./logs/
loss: CrossEntropyLoss
pretrain: False
verbose: True
viz: False
lr: 0.001
lr_adjust: fix
stepvalues: []
weight_decay: 0.0
gamma: 0.1
momentum: 0.9
batch_size: 4
optim: Adam
save_freq: 1
display_freq: 10
viz_freq: 50
log_freq: 10000
val_freq: 1000
net: FSNet_M
mid: False
embed: False
onlybackbone: False
rescale: 255.0
test_size: [640, 960]
checkepoch: 1070
img_root: None
device: cuda
=============End=============
MixNet backbone parameter size: 29339968
load pretrain weight from /app/MixNet/pretrained_models/pre_trained_FSNet_M/triHRnet_Synth_weight.pth.
Start training MixNet.
Epoch: 0 : LR = [0.001] :
