modern-fortran / neural-fortran

A parallel framework for deep learning

Refactor for convnets #58

Closed milancurcic closed 2 years ago

milancurcic commented 2 years ago

The original neural-fortran code was limited in application because the network type was hardcoded for dense (fully-connected) layers. This PR introduces a large refactor of the library to allow extending it to other network architectures (convolutional for imagery and model data, recurrent for time series, etc.).
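To make the extensibility concrete, here is a minimal sketch of what a layer-constructor style of building a network looks like. The module name nf and the constructor names (network, input, dense) are assumptions for illustration, not necessarily the exact interface introduced in this PR:

    ! Sketch only: module and constructor names are assumed for illustration.
    program construct_network
      use nf, only: network, input, dense
      implicit none
      type(network) :: net

      ! The network is assembled from a list of layer constructors, so a new
      ! layer type (convolutional, recurrent, ...) can be added by providing
      ! a new constructor and layer implementation, without changing the
      ! network type itself.
      net = network([input(784), dense(30), dense(10)])
    end program construct_network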

Key changes:

What's not there anymore:

A nice side effect of this refactor is that the MNIST training example is about 135% (2.35 times) faster than the original code, likely because this time around I was careful to minimize copies and re-allocations. This result is with ifort-2021.3 using -Ofast on an Intel E5-1650.
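To illustrate the kind of change this refers to (a sketch, not the actual diff), assigning into preallocated storage avoids the reallocate-and-copy that assignment to a bare allocatable left-hand side may trigger inside a hot loop:

    ! Illustrative sketch, not code from the PR.
    program reuse_storage
      implicit none
      real, allocatable :: a(:), grad(:)
      integer :: i

      allocate(a(1000), grad(1000))
      a = 1.0

      do i = 1, 10000
        ! With a bare allocatable left-hand side (grad = 2.0 * a), the
        ! compiler must handle possible reallocation on every assignment.
        ! Assigning to an array section reuses the existing allocation.
        grad(:) = 2.0 * a(:)
      end do

      print *, sum(grad)
    end program reuse_storage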

Known issues:

TODO before merging:

CC @katherbreen

milancurcic commented 2 years ago

Known issues:

With higher optimization levels on GFortran (anything above -O0), the network does not converge as expected, and this is true for all three included examples. For example, the MNIST example reaches accuracy in the high 80% range in one epoch and then slowly degrades in subsequent epochs. The same behavior occurs with GFortran 9.4.0 and 10.3.0. The issue goes away with -O0, and doesn't appear at any optimization level with ifort. I hope to diagnose and resolve this before the merge.

Adding -fno-frontend-optimize allows GFortran to generate code that runs correctly (the examples converge) at any optimization level, including -Ofast. So -ffrontend-optimize, which is implied by any optimization level above -O0, seems to cause the issue. I don't know exactly why yet. From the GFortran manual:

       -ffrontend-optimize
           This option performs front-end optimization, based on manipulating parts of the Fortran parse tree.
           Enabled by default by any -O option except -O0 and -Og. Optimizations enabled by this option include:

           * inlining calls to "MATMUL",
           * elimination of identical function calls within expressions,
           * removing unnecessary calls to "TRIM" in comparisons and assignments,
           * replacing TRIM(a) with "a(1:LEN_TRIM(a))", and
           * short-circuiting of logical operators (".AND." and ".OR.").

           It can be deselected by specifying -fno-frontend-optimize.

Of these, inlining calls to "MATMUL" and elimination of identical function calls within expressions seem like the most likely candidates for the cause of the issue. I don't know whether this list of optimizations is complete or only a subset.
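As a contrived illustration (not neural-fortran code) of why eliminating "identical" calls is only safe for functions without side effects: if the two calls below were folded into one, the result would change, since each call advances the random-number generator state.

    ! Contrived sketch: two textually identical calls to a non-pure function
    ! are not interchangeable, so folding them into one call changes the result.
    program identical_calls
      implicit none
      real :: s
      s = noise() + noise()   ! if folded, this would become 2*noise()
      print *, s
    contains
      function noise() result(r)
        real :: r
        call random_number(r)   ! side effect: advances the RNG state
      end function noise
    end program identical_calls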