Open MARD1NO opened 2 years ago
The main differences from the fp32 run are:
- lr changed to 0.003
- Adam's eps increased to 1e-4
- some changes to the lr_scheduler
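The post does not say why eps was raised, so the following is only a sketch of what a larger eps does in a textbook Adam update, not the actual training script. The function name `adam_first_step` and the beta defaults are assumptions made for illustration. It computes the first bias-corrected Adam step from zero state, where the update reduces to `-lr * g / (|g| + eps)`: with a tiny eps the step size is close to lr even for very small gradients, while eps=1e-4 damps the step for those gradients.

```python
import math


def adam_first_step(grad, lr, eps, beta1=0.9, beta2=0.999):
    """Parameter change from one bias-corrected Adam step, starting from zero state.

    Hypothetical helper for illustration; beta1/beta2 are assumed defaults.
    """
    # First and second moment estimates after one step from zero.
    m = (1 - beta1) * grad
    v = (1 - beta2) * grad * grad
    # Bias correction for step t = 1.
    m_hat = m / (1 - beta1)
    v_hat = v / (1 - beta2)
    return -lr * m_hat / (math.sqrt(v_hat) + eps)


if __name__ == "__main__":
    tiny_grad = 1e-5
    for eps in (1e-8, 1e-4):
        step = adam_first_step(tiny_grad, lr=0.003, eps=eps)
        print(f"eps={eps:g}: step={step:.6e}")
```

For a gradient of 1e-5, the eps=1e-4 step is roughly an order of magnitude smaller than the eps=1e-8 step, which may be part of why a larger eps helps with noisy small gradients under mixed precision.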
Rank[0], Epoch 7, Step 70000, AUC 0.795594, LogLoss 0.123677, Eval_time 24.48 s, Metrics_time 4.56 s, Eval_samples 89140000, GPU_Memory 14512 MiB, Host_Memory 12285 MiB, 2022-07-27 10:25:39
Rank[0], Step 71000, Loss 0.1217, Latency 9.213 ms, Throughput 6001880.3, 2022-07-27 10:25:48
Rank[0], Step 72000, Loss 0.1199, Latency 9.162 ms, Throughput 6035043.0, 2022-07-27 10:25:57
Rank[0], Step 73000, Loss 0.1265, Latency 9.307 ms, Throughput 5941121.9, 2022-07-27 10:26:07
Rank[0], Step 74000, Loss 0.1234, Latency 9.534 ms, Throughput 5799602.2, 2022-07-27 10:26:16
Rank[0], Step 75000, Loss 0.1247, Latency 9.194 ms, Throughput 6014392.4, 2022-07-27 10:26:25
================ Test Evaluation ================
Rank[0], Epoch 7, Step 75000, AUC 0.801010, LogLoss 0.126012, Eval_time 19.27 s, Metrics_time 4.80 s, Eval_samples 89140000, GPU_Memory 14512 MiB, Host_Memory 9733 MiB, 2022-07-27 10:26:50
Simply lowering the lr in the fp32 script also works. DCN AMP with lr=0.001:
================ Test Evaluation ================
Rank[0], Step 75000, AUC 0.803214, LogLoss 0.000000, Eval_time 2.04 s, Metrics_time 2.24 s, Eval_samples 89192448, GPU_Memory 14507 MiB, Host_Memory 6214 MiB, 2022-07-27 15:47:50