Open iluise opened 3 weeks ago
Samples per second for temperature (local normalisation, embedding dimension 512) seem unstable:
0: epoch: 19 [219/512 (43%)] Loss: 0.26280 : 0.03152 :: 0.13843 (224.48 s/sec)
0: epoch: 19 [220/512 (43%)] Loss: 0.28077 : 0.03847 :: 0.14124 (231.14 s/sec)
0: epoch: 19 [221/512 (43%)] Loss: 0.24217 : 0.02282 :: 0.12537 (61.81 s/sec)
0: epoch: 19 [222/512 (43%)] Loss: 0.27576 : 0.03416 :: 0.14417 (234.55 s/sec)
0: epoch: 19 [223/512 (44%)] Loss: 0.27557 : 0.03425 :: 0.14246 (12.98 s/sec)
0: epoch: 19 [224/512 (44%)] Loss: 0.27298 : 0.03561 :: 0.13977 (164.26 s/sec)
0: epoch: 19 [225/512 (44%)] Loss: 0.26297 : 0.03092 :: 0.13477 (227.73 s/sec)
0: epoch: 19 [226/512 (44%)] Loss: 0.27038 : 0.03316 :: 0.13501 (14.48 s/sec)
0: epoch: 19 [227/512 (44%)] Loss: 0.26069 : 0.02907 :: 0.13635 (45.37 s/sec)
0: epoch: 19 [228/512 (45%)] Loss: 0.26004 : 0.03117 :: 0.13708 (176.89 s/sec)
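To put a number on the instability, here is a rough sketch that pulls the `s/sec` values out of the log excerpt above and summarises them. The function name `throughput_stats` and the "slower than half the median" threshold are my own choices for illustration, not something from the training code.

```python
import re
import statistics

# Log excerpt copied from this issue.
LOG = """\
0: epoch: 19 [219/512 (43%)] Loss: 0.26280 : 0.03152 :: 0.13843 (224.48 s/sec)
0: epoch: 19 [220/512 (43%)] Loss: 0.28077 : 0.03847 :: 0.14124 (231.14 s/sec)
0: epoch: 19 [221/512 (43%)] Loss: 0.24217 : 0.02282 :: 0.12537 (61.81 s/sec)
0: epoch: 19 [222/512 (43%)] Loss: 0.27576 : 0.03416 :: 0.14417 (234.55 s/sec)
0: epoch: 19 [223/512 (44%)] Loss: 0.27557 : 0.03425 :: 0.14246 (12.98 s/sec)
0: epoch: 19 [224/512 (44%)] Loss: 0.27298 : 0.03561 :: 0.13977 (164.26 s/sec)
0: epoch: 19 [225/512 (44%)] Loss: 0.26297 : 0.03092 :: 0.13477 (227.73 s/sec)
0: epoch: 19 [226/512 (44%)] Loss: 0.27038 : 0.03316 :: 0.13501 (14.48 s/sec)
0: epoch: 19 [227/512 (44%)] Loss: 0.26069 : 0.02907 :: 0.13635 (45.37 s/sec)
0: epoch: 19 [228/512 (45%)] Loss: 0.26004 : 0.03117 :: 0.13708 (176.89 s/sec)
"""

# Matches the trailing "(<number> s/sec)" field; "(43%)" does not match.
RATE_RE = re.compile(r"\(([\d.]+) s/sec\)")


def throughput_stats(log_text):
    """Summarise the samples-per-second values found in a log excerpt."""
    rates = [float(r) for r in RATE_RE.findall(log_text)]
    median = statistics.median(rates)
    return {
        "n": len(rates),
        "min": min(rates),
        "median": median,
        "max": max(rates),
        # Illustrative threshold: steps slower than half the median rate.
        "slow_steps": sum(r < 0.5 * median for r in rates),
    }


stats = throughput_stats(LOG)
print(stats)
```

Running this over a longer stretch of the log (or over the same epoch on MareNostrum) would show whether the slow steps are occasional outliers or a regular pattern.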
That's on MareNostrum?
That's on Juelich actually. I can check on MareNostrum (MN) and post the results here.
If the machine in Juelich is under heavy load, it might just be fluctuations in file system latency.
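One way to test the file system hypothesis would be to time the batch fetch separately from the compute step. The sketch below is generic and makes assumptions: `timed_steps`, `slow_loader`, and the no-op training step are hypothetical stand-ins, not names from the actual code. If the real step launches asynchronous GPU work, the device would also need to be synchronised before reading the clock, otherwise the compute time is under-reported.

```python
import time


def timed_steps(batches, train_step):
    """Yield (load_seconds, compute_seconds) for every training step."""
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent waiting on the data loader / file system
        except StopIteration:
            return
        t1 = time.perf_counter()
        train_step(batch)  # time spent in forward/backward/optimizer
        t2 = time.perf_counter()
        yield t1 - t0, t2 - t1


def slow_loader(n, slow_every=3, delay=0.05):
    """Hypothetical loader that stalls on every `slow_every`-th batch."""
    for i in range(n):
        if i % slow_every == 0:
            time.sleep(delay)  # stands in for a slow file system read
        yield i


# Demo with a no-op training step: only the load time should spike.
timings = list(timed_steps(slow_loader(6), lambda batch: None))
for step, (load_s, compute_s) in enumerate(timings):
    print(f"step {step}: load {load_s:.4f}s, compute {compute_s:.4f}s")
```

If the slow steps in the real run show large load times and flat compute times, that would point at I/O rather than the model.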