Closed by milancurcic 1 year ago
Git bisect revealed that this PR dropped the training accuracy from >90% to ~10% (https://github.com/modern-fortran/neural-fortran/issues/145#issuecomment-2011035667). However, based on the PR description, it seems this PR fixed some issues and there are more issues left to fix.
Was there ever a time when the training fully worked?
Good question. There was a time when I thought it was converging (although not at the expected level of 96% or so accuracy, but rather in the mid-80s IIRC), because I had written a poor test. However, I think 2-d CNN training never worked correctly; it's implemented and likely a bug or two away from working, but I haven't made fixing it a priority yet. We know that inference works because we can load a pre-trained Keras CNN and infer with high accuracy. So the bug(s) are somewhere in the backward pass of one or more of the conv2d, maxpool2d, flatten, and/or reshape layers.
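A standard way to isolate which layer's backward pass is buggy is a finite-difference gradient check: perturb each parameter, recompute the loss, and compare the numerical gradient against the analytic one, layer by layer. Here's a minimal sketch of the idea in Python/NumPy (not neural-fortran's API; the toy layer and helper names are hypothetical, for illustration only):

```python
import numpy as np

def numerical_grad(f, x, eps=1e-5):
    """Central-difference gradient of scalar function f w.r.t. array x."""
    g = np.zeros_like(x)
    it = np.nditer(x, flags=["multi_index"])
    for _ in it:
        i = it.multi_index
        orig = x[i]
        x[i] = orig + eps
        fp = f(x)
        x[i] = orig - eps
        fm = f(x)
        x[i] = orig  # restore
        g[i] = (fp - fm) / (2 * eps)
    return g

# Toy "layer": y = w * x elementwise, loss = sum(y**2).
# Analytic gradient: dL/dw_i = 2 * w_i * x_i**2.
rng = np.random.default_rng(42)
x = rng.standard_normal(3)
w = rng.standard_normal(3)

analytic = 2 * w * x**2
numeric = numerical_grad(lambda w_: np.sum((w_ * x) ** 2), w.copy())
print(np.allclose(analytic, numeric, atol=1e-6))
```

Applying the same check to each suspect layer (conv2d, maxpool2d, flatten, reshape) in isolation, rather than to the whole network at once, pinpoints which backward implementation disagrees with the numerical gradient.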
`cnn_mnist` was previously converging to a low ~93% accuracy solution because the convolutional layers were disconnected in the backward pass, so only the output dense layer was being trained. This PR is a WIP that connects these layers. Now that they're connected, `cnn_mnist` doesn't converge due to not-yet-uncovered issues in the backward pass of the `conv2d` and possibly `maxpool2d` layers as well.
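For reference, here is what correct backward passes for these two layer types look like in a minimal Python/NumPy sketch (single channel, stride 1, valid padding, non-overlapping pool windows; this is not neural-fortran's implementation, just the textbook formulation the Fortran code should agree with): the conv2d backward accumulates the kernel gradient from input patches and routes the input gradient back through the kernel, and the maxpool2d backward routes each output gradient to the argmax position of its window.

```python
import numpy as np

def conv2d_forward(x, k):
    """Valid 2-d cross-correlation, stride 1, single channel."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def conv2d_backward(x, k, dout):
    """Gradients of the valid conv w.r.t. kernel (dk) and input (dx)."""
    kh, kw = k.shape
    dk = np.zeros_like(k)
    dx = np.zeros_like(x)
    for i in range(dout.shape[0]):
        for j in range(dout.shape[1]):
            dk += x[i:i + kh, j:j + kw] * dout[i, j]  # accumulate over patches
            dx[i:i + kh, j:j + kw] += k * dout[i, j]  # scatter through kernel
    return dk, dx

def maxpool2d_backward(x, dout, p=2):
    """Route each output gradient to the argmax position of its p-by-p window."""
    dx = np.zeros_like(x)
    for i in range(dout.shape[0]):
        for j in range(dout.shape[1]):
            win = x[i * p:(i + 1) * p, j * p:(j + 1) * p]
            r, c = np.unravel_index(np.argmax(win), win.shape)
            dx[i * p + r, j * p + c] += dout[i, j]
    return dx
```

Common bugs that produce exactly the "inference works, training doesn't" symptom include computing `dk` from the wrong (un-flipped or doubly-flipped) orientation, forgetting to accumulate `dx` across overlapping windows, and dropping the incoming `dout` factor entirely, all of which a per-layer finite-difference check would catch.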