modern-fortran / neural-fortran

A parallel framework for deep learning

Connect `flatten`, `conv2d`, and `maxpool2d` layers in backward pass #142

Closed · milancurcic closed this 1 year ago

milancurcic commented 1 year ago

cnn_mnist was previously converging to a low-accuracy (~93%) solution because the convolutional layers were disconnected in the backward pass, so only the output dense layer was being trained. This PR is a WIP that connects those layers. Now that they're connected, cnn_mnist doesn't converge, due to not-yet-uncovered issues in the backward pass of the conv2d and possibly maxpool2d layers as well.
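To illustrate why a disconnected backward pass still "trains" but plateaus, here is a minimal, self-contained toy sketch in plain Fortran. It is not neural-fortran code and uses no neural-fortran API; it just shows that if the gradient is not chained from the output layer back through the earlier layer, the earlier layer's weight never moves and only the last layer learns.

```fortran
! Toy two-layer linear model trained on a single target.  The first weight
! w1 only updates if the backward pass propagates the gradient through the
! second layer back to the first; otherwise only w2 is trained.
program backward_chain_demo
  implicit none
  real :: x, y_target, w1, w2, h, y, lr
  real :: dL_dy, dL_dw2, dL_dh, dL_dw1
  integer :: step

  x = 1.0
  y_target = 2.0
  w1 = 0.5
  w2 = 0.5
  lr = 0.1

  do step = 1, 100
    ! Forward pass: h = w1*x, y = w2*h
    h = w1 * x
    y = w2 * h

    ! Loss L = 0.5*(y - y_target)**2, so dL/dy = y - y_target
    dL_dy = y - y_target

    ! Output-layer gradient: always computed
    dL_dw2 = dL_dy * h

    ! Gradient chained back through layer 2 to layer 1.  If this step is
    ! skipped (the "disconnected" case), dL_dw1 is never computed and w1
    ! stays at its initial value.
    dL_dh  = dL_dy * w2
    dL_dw1 = dL_dh * x

    w2 = w2 - lr * dL_dw2
    w1 = w1 - lr * dL_dw1
  end do

  print '(a, f8.4, a, f8.4)', 'w1 = ', w1, '   w2 = ', w2
  print '(a, f8.4)', 'final output y = ', w2 * w1 * x
end program backward_chain_demo
```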

certik commented 5 months ago

Git bisect revealed that this PR dropped the training accuracy from >90% to ~10% (https://github.com/modern-fortran/neural-fortran/issues/145#issuecomment-2011035667). However, based on the PR description, it seems this PR fixed some issues and there are more issues still to fix.

Was there ever a time when the training fully worked?

milancurcic commented 5 months ago

Good question. There was a time when I thought it was converging (although not at the expected level of ~96% accuracy, but rather in the mid-80s, IIRC) because I had written a poor test. However, I think 2-d CNN training never worked correctly; it's implemented and likely a bug or two away from working, but I haven't made it a priority to fix it yet. We know that inference works because we can load a pre-trained Keras CNN and infer with high accuracy. So the bug(s) are somewhere in the backward pass of one or more of the conv2d, maxpool2d, flatten, and/or reshape layers.
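For reference, here is a standalone sketch of what a maxpool2d backward pass has to do: route each output gradient only to the input location that produced the window maximum in the forward pass. This is plain Fortran, not neural-fortran's implementation, and the shapes and variable names are illustrative only; mis-tracking these argmax indices is one plausible way a backward pass like this can silently break training while inference stays correct.

```fortran
! Max pooling (2x2, non-overlapping) forward pass with stored argmax
! indices, followed by the backward pass that routes gradients to them.
program maxpool_backward_sketch
  implicit none
  integer, parameter :: nx = 4, ny = 4, pool = 2
  real :: input(nx, ny), output(nx/pool, ny/pool)
  real :: grad_out(nx/pool, ny/pool), grad_in(nx, ny)
  integer :: imax(nx/pool, ny/pool), jmax(nx/pool, ny/pool)
  integer :: i, j, ii, jj

  call random_number(input)
  grad_out = 1.0      ! pretend gradient arriving from the layer above
  grad_in = 0.0

  ! Forward pass: take the max of each pool x pool window and remember
  ! where it came from.
  do j = 1, ny / pool
    do i = 1, nx / pool
      output(i, j) = -huge(0.0)
      do jj = (j - 1) * pool + 1, j * pool
        do ii = (i - 1) * pool + 1, i * pool
          if (input(ii, jj) > output(i, j)) then
            output(i, j) = input(ii, jj)
            imax(i, j) = ii
            jmax(i, j) = jj
          end if
        end do
      end do
    end do
  end do

  ! Backward pass: each output gradient flows only to the stored argmax
  ! location; every other input position receives zero gradient.
  do j = 1, ny / pool
    do i = 1, nx / pool
      grad_in(imax(i, j), jmax(i, j)) = grad_in(imax(i, j), jmax(i, j)) &
                                      + grad_out(i, j)
    end do
  end do

  print '(4f8.3)', (grad_in(:, j), j = 1, ny)
end program maxpool_backward_sketch
```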