modern-fortran / neural-fortran

A parallel framework for deep learning
MIT License

Replacement of a matmul + use of merge #181

Closed by jvdp1 4 months ago

jvdp1 commented 4 months ago

As discussed, to be tested on different datasets

milancurcic commented 4 months ago

Some quick and dirty timings of examples/dense_mnist. This is on an AMD Ryzen 5 5500U (a lower-end mobile CPU):

GFortran 11

ifort classic 2021.10

The overall training speedup is very nice, but the best part is that this PR also fixes the erroneous behavior with GFortran in release mode, which previously required -fno-frontend-optimize.

@jvdp1 is this PR still a draft or can we mark it as "Ready for review"?

jvdp1 commented 4 months ago

Thank you @milancurcic for testing the changes. It is actually ready.

milancurcic commented 4 months ago

Excellent, I'll go ahead and merge it then. Thank you!