Open khawar-islam opened 3 years ago
I am training on the same GPU as you, a Tesla V100. How many days does one epoch take?
It seems that it takes about 5 days to train a model.
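For anyone wanting to sanity-check an estimate like this on their own hardware, a rough calculation from throughput may help. This is only a sketch: the function name, dataset size, and epoch count below are placeholder assumptions for illustration, not values taken from this repository.

```python
def estimate_training_days(num_samples, epochs, samples_per_sec):
    """Estimate wall-clock training time in days from measured throughput."""
    if samples_per_sec <= 0:
        raise ValueError("samples_per_sec must be positive")
    total_seconds = num_samples * epochs / samples_per_sec
    return total_seconds / 86400.0  # 86400 seconds per day


# Hypothetical numbers for illustration only; substitute your own
# dataset size, epoch count, and the "Speed" value from your log.
days = estimate_training_days(num_samples=1_000_000, epochs=10, samples_per_sec=250)
print(f"Estimated training time: {days:.2f} days")
```

Note that the estimate ignores evaluation and checkpointing pauses, so the real time will likely be somewhat longer.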
Thank you @zhongyy
Hi,
I am running the training again, and accuracy is currently below 60% on all datasets and batch count is
highest_acc: [0.5375, 0.5388333333333333, 0.5048333333333334, 0.517, 0.562142857142857, 0.5076666666666666]
Epoch 1 Batch 17650 Speed: 5.85 samples/s Training Loss 33.7908 (33.8089) Training Prec@1 0.000 (0.000)
Epoch 1 Batch 17660 Speed: 212.87 samples/s Training Loss 33.6953 (33.7547) Training Prec@1 0.000 (0.000)
Learning rate 0.000001
Perform Evaluation on ['lfw', 'talfw', 'calfw', 'cplfw', 'cfp_fp', 'agedb_30'] , and Save Checkpoints...
(12000, 512)
How many days does it take to train completely?
My training speed is about 280 samples/s. The accuracy on the test sets (LFW ...) will not be high during the first epoch, since the learning rate for that epoch is 0.000001 (warmup).
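To illustrate why accuracy stays near chance during warmup, here is a minimal sketch of one common warmup form, a linear ramp. The function name, base learning rate, and warmup length are assumptions for illustration; only the starting value 0.000001 comes from the log above, and the repository's actual schedule may differ (for example, it may hold the learning rate constant for the whole first epoch).

```python
def warmup_lr(step, warmup_steps, base_lr, start_lr=1e-6):
    """Linearly ramp the learning rate from start_lr to base_lr.

    After warmup_steps the function returns base_lr unchanged.
    All parameter values used here are hypothetical.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr
    frac = step / warmup_steps
    return start_lr + frac * (base_lr - start_lr)


# During the earliest steps the learning rate is tiny, so the weights
# barely move and verification accuracy stays close to 50%.
for step in (0, 500, 1000):
    print(step, warmup_lr(step, warmup_steps=1000, base_lr=1e-3))
```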
@zhongyy Can you share your training log?