Hi, I'm trying to train the MobileNetV2 PruningNet from scratch on 4 V100 GPUs (batch_size=256). I find that training one batch takes about 3 seconds (probably because of the random channel cropping of the network during training). Is that normal? How much time did you spend training the MobileNetV2 PruningNet from scratch (64 epochs)?
Part of the training log:
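For a rough estimate from the log below (assuming the per-iteration time stays around the average shown): ~3.3 s/iter × 5004 iters/epoch ≈ 16,500 s ≈ 4.6 h per epoch, so 64 epochs would take roughly 12 days on this setup. That seems very long, which is why I'm asking.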
Epoch: [0][0/5004] Time 3.857 (3.857) Data 0.000 (0.000) Loss 6.9178 (6.9178) Prec@1 0.000 (0.000) Prec@5 0.000 (0.000)
Epoch: [0][1/5004] Time 3.421 (3.639) Data 0.000 (0.000) Loss 6.9392 (6.9285) Prec@1 0.000 (0.000) Prec@5 0.781 (0.391)
Epoch: [0][2/5004] Time 3.475 (3.584) Data 0.000 (0.000) Loss 6.9520 (6.9363) Prec@1 0.000 (0.000) Prec@5 0.391 (0.391)
Epoch: [0][3/5004] Time 3.235 (3.497) Data 0.000 (0.000) Loss 6.9477 (6.9392) Prec@1 0.000 (0.000) Prec@5 0.781 (0.488)
Epoch: [0][4/5004] Time 3.162 (3.430) Data 0.000 (0.000) Loss 6.9354 (6.9384) Prec@1 0.781 (0.156) Prec@5 0.781 (0.547)
Epoch: [0][5/5004] Time 3.129 (3.380) Data 0.000 (0.000) Loss 6.9591 (6.9419) Prec@1 0.391 (0.195) Prec@5 0.391 (0.521)
Epoch: [0][6/5004] Time 3.146 (3.347) Data 0.000 (0.000) Loss 6.9494 (6.9429) Prec@1 0.781 (0.279) Prec@5 0.781 (0.558)
Epoch: [0][7/5004] Time 3.138 (3.321) Data 0.000 (0.000) Loss 6.9903 (6.9489) Prec@1 0.000 (0.244) Prec@5 0.781 (0.586)
Epoch: [0][8/5004] Time 3.393 (3.329) Data 0.000 (0.000) Loss 6.9696 (6.9512) Prec@1 0.000 (0.217) Prec@5 0.000 (0.521)
Epoch: [0][9/5004] Time 3.495 (3.345) Data 0.000 (0.000) Loss 7.0030 (6.9563) Prec@1 0.000 (0.195) Prec@5 0.000 (0.469)
Epoch: [0][10/5004] Time 3.307 (3.342) Data 0.000 (0.000) Loss 7.0157 (6.9617) Prec@1 0.391 (0.213) Prec@5 0.781 (0.497)
Epoch: [0][11/5004] Time 3.254 (3.334) Data 0.000 (0.000) Loss 7.0124 (6.9660) Prec@1 0.000 (0.195) Prec@5 0.781 (0.521)
Epoch: [0][12/5004] Time 3.694 (3.362) Data 0.000 (0.000) Loss 7.0236 (6.9704) Prec@1 0.000 (0.180) Prec@5 1.172 (0.571)
Epoch: [0][13/5004] Time 3.186 (3.350) Data 0.000 (0.000) Loss 7.0330 (6.9749) Prec@1 0.000 (0.167) Prec@5 0.000 (0.530)
Epoch: [0][14/5004] Time 3.180 (3.338) Data 0.000 (0.000) Loss 7.0146 (6.9775) Prec@1 0.000 (0.156) Prec@5 0.781 (0.547)
Epoch: [0][15/5004] Time 3.272 (3.334) Data 0.000 (0.000) Loss 7.1130 (6.9860) Prec@1 0.000 (0.146) Prec@5 0.000 (0.513)
Epoch: [0][16/5004] Time 2.912 (3.309) Data 0.000 (0.000) Loss 7.0441 (6.9894) Prec@1 0.000 (0.138) Prec@5 0.781 (0.528)
Epoch: [0][17/5004] Time 3.199 (3.303) Data 0.000 (0.000) Loss 7.0701 (6.9939) Prec@1 0.000 (0.130) Prec@5 0.391 (0.521)
Epoch: [0][18/5004] Time 3.163 (3.296) Data 0.000 (0.000) Loss 7.1076 (6.9999) Prec@1 0.000 (0.123) Prec@5 0.000 (0.493)
Epoch: [0][19/5004] Time 3.197 (3.291) Data 0.000 (0.000) Loss 7.1321 (7.0065) Prec@1 0.000 (0.117) Prec@5 0.391 (0.488)
Epoch: [0][20/5004] Time 3.116 (3.283) Data 0.000 (0.000) Loss 7.0883 (7.0104) Prec@1 0.000 (0.112) Prec@5 0.391 (0.484)
Epoch: [0][21/5004] Time 3.464 (3.291) Data 0.000 (0.000) Loss 7.0444 (7.0119) Prec@1 0.000 (0.107) Prec@5 0.000 (0.462)
Epoch: [0][22/5004] Time 3.135 (3.284) Data 0.000 (0.000) Loss 7.0642 (7.0142) Prec@1 0.000 (0.102) Prec@5 0.391 (0.459)
Epoch: [0][23/5004] Time 3.392 (3.288) Data 0.000 (0.000) Loss 7.0659 (7.0163) Prec@1 0.000 (0.098) Prec@5 0.781 (0.472)
Epoch: [0][24/5004] Time 3.117 (3.282) Data 0.000 (0.000) Loss 7.0385 (7.0172) Prec@1 0.000 (0.094) Prec@5 0.391 (0.469)
Epoch: [0][25/5004] Time 3.271 (3.281) Data 0.000 (0.000) Loss 7.0659 (7.0191) Prec@1 0.000 (0.090) Prec@5 0.781 (0.481)
Epoch: [0][26/5004] Time 3.461 (3.288) Data 0.000 (0.000) Loss 7.0382 (7.0198) Prec@1 0.000 (0.087) Prec@5 0.391 (0.477)
Epoch: [0][27/5004] Time 2.958 (3.276) Data 0.000 (0.000) Loss 7.0603 (7.0213) Prec@1 0.000 (0.084) Prec@5 0.000 (0.460)
Epoch: [0][28/5004] Time 3.120 (3.271) Data 0.000 (0.000) Loss 7.1257 (7.0249) Prec@1 0.391 (0.094) Prec@5 0.391 (0.458)
Epoch: [0][29/5004] Time 3.212 (3.269) Data 0.000 (0.000) Loss 7.0864 (7.0269) Prec@1 0.000 (0.091) Prec@5 0.391 (0.456)
Epoch: [0][30/5004] Time 3.090 (3.263) Data 0.000 (0.000) Loss 7.1347 (7.0304) Prec@1 0.391 (0.101) Prec@5 0.391 (0.454)
Epoch: [0][31/5004] Time 2.839 (3.250) Data 0.000 (0.000) Loss 7.0732 (7.0317) Prec@1 0.000 (0.098) Prec@5 0.781 (0.464)
Epoch: [0][32/5004] Time 3.346 (3.253) Data 0.000 (0.000) Loss 7.1425 (7.0351) Prec@1 0.391 (0.107) Prec@5 0.391 (0.462)
Epoch: [0][33/5004] Time 3.508 (3.260) Data 0.000 (0.000) Loss 7.0733 (7.0362) Prec@1 0.000 (0.103) Prec@5 0.781 (0.471)
Epoch: [0][34/5004] Time 3.215 (3.259) Data 0.000 (0.000) Loss 7.1465 (7.0394) Prec@1 0.000 (0.100) Prec@5 0.000 (0.458)
Epoch: [0][35/5004] Time 3.071 (3.254) Data 0.000 (0.000) Loss 7.0800 (7.0405) Prec@1 0.781 (0.119) Prec@5 1.562 (0.488)
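To see where the ~3 s per iteration actually goes, here is a minimal timing sketch, assuming a standard PyTorch training loop similar to the one producing this log; `model`, `criterion`, `optimizer`, `train_loader`, and `device` are placeholders for the objects in the training script. It separates the DataLoader wait time from the GPU forward/backward time of the sampled network:

```python
import time
import torch

def time_some_iters(model, criterion, optimizer, train_loader, device, num_iters=20):
    """Measure data-loading time vs. forward/backward time for a few iterations."""
    model.train()
    end = time.time()
    for i, (images, target) in enumerate(train_loader):
        data_time = time.time() - end            # time spent waiting on the DataLoader

        images = images.to(device, non_blocking=True)
        target = target.to(device, non_blocking=True)

        torch.cuda.synchronize()
        compute_start = time.time()
        output = model(images)                   # forward pass of the (randomly sampled) pruned structure
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        torch.cuda.synchronize()                 # wait for queued GPU work so the timing is accurate
        compute_time = time.time() - compute_start

        print(f"iter {i}: data {data_time:.3f}s  compute {compute_time:.3f}s")
        end = time.time()
        if i + 1 >= num_iters:
            break
```

If `compute` dominates, the cost is in the PruningNet forward/backward itself; if `data` dominates, the DataLoader (e.g., number of workers) is the bottleneck.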