Closed by jvdp1 4 months ago
Some quick-and-dirty timings of examples/dense_mnist. This is on an AMD Ryzen 5 5500U (lower-end mobile CPU):
- main: ~35 s
- jvdp1:optim: ~20 s

-fno-frontend-optim:

- main: ~25 s
- jvdp1:optim: ~9 s

- main: ~8 s, does not converge (bad result!)
- jvdp1:optim: ~6 s, converges (good result!)

- main: ~8.2 s
- jvdp1:optim: ~5.3 s

The overall training speedup is very nice, but the best part is that this PR also fixes the erroneous behavior with GFortran in release mode, which previously required -fno-frontend-optimize.
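For a rough sense of scale, the approximate timings above can be turned into speedup factors (main time divided by jvdp1:optim time). This is only a small sketch: the numbers are the rounded values quoted in the comment, and the labels "set 1", "set 3", and "set 4" are placeholder names introduced here for illustration, not the names of the actual build configurations.

```python
# Approximate wall-clock timings in seconds, copied from the comment above.
# Only "-fno-frontend-optim" is a label taken from the comment; "set 1",
# "set 3", and "set 4" are hypothetical placeholder labels.
timings = [
    ("set 1", 35.0, 20.0),
    ("-fno-frontend-optim", 25.0, 9.0),
    ("set 3", 8.0, 6.0),   # main did not converge in this run
    ("set 4", 8.2, 5.3),
]


def speedup(main_seconds, optim_seconds):
    """Return how many times faster the optim branch ran than main."""
    return main_seconds / optim_seconds


# Speedup factor per timing pair, rounded to two decimals.
speedups = {
    label: round(speedup(main_s, optim_s), 2)
    for label, main_s, optim_s in timings
}

for label, factor in speedups.items():
    print(f"{label}: {factor}x")
```

Since the inputs are quoted as approximate, the factors are indicative only. The "set 3" ratio is also not a like-for-like comparison, because the main run in that pair did not converge.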
@jvdp1 is this PR still a draft or can we mark it as "Ready for review"?
Thank you @milancurcic for testing the changes. It is actually ready.
Excellent, I'll go ahead and merge it then. Thank you!
As discussed, to be tested on different datasets