nedtaylor opened 1 month ago
This has been implemented as of df2e707452b832ba4dbd107e17d16f67a5b2ef70 and reproduces the same results as before for the mnist example. From my initial testing, this implementation causes no slowdown for the mnist example.
Notes:
1) This temporarily removes the ability to provide skip inputs (previously implemented as addit_input), but the framework is in place to accept any number of input layers that can feed in at multiple points in the network, so this should end up as an even better implementation (see the sketch below for the general idea).
2) Convolutional layers still allow padding as before, but this NEEDS to be handled better (via the specific layer) instead of by changing the prior inputs. Issue #6 has been open for some time with the plan to resolve this.
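For illustration only (this is Keras, not the athena API; all names and shapes here are hypothetical), a minimal functional-API sketch of what "extra inputs feeding in at multiple points" looks like:

```python
# Illustrative sketch (tf.keras, not the athena API): a second input
# feeding into the network partway through, merged with an Add layer.
import tensorflow as tf
from tensorflow.keras import layers

main_in = tf.keras.Input(shape=(28, 28, 1), name="main_input")
skip_in = tf.keras.Input(shape=(26, 26, 16), name="extra_input")  # hypothetical second input

x = layers.Conv2D(16, 3, activation="relu")(main_in)  # valid padding -> (26, 26, 16)
x = layers.Add()([x, skip_in])                         # extra input joins mid-network
x = layers.Flatten()(x)
out = layers.Dense(10, activation="softmax")(x)

model = tf.keras.Model(inputs=[main_in, skip_in], outputs=out)
```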
With the initial implementation showing good results, I am moving this to the intended features for version 2, as it works well alongside the change to inputs and outputs (#19).
Reasoning
Skip layers help to mitigate the vanishing gradient problem. They are a fundamental part of many modern neural networks (e.g. ResNets).
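A one-line sketch of why: for a residual connection $y = x + F(x)$, the backward pass gives

$$\frac{\partial \mathcal{L}}{\partial x} = \frac{\partial \mathcal{L}}{\partial y}\left(I + \frac{\partial F}{\partial x}\right),$$

so the gradient always has an identity path back through the network, even when $\partial F / \partial x$ becomes small.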
Prior Art
TensorFlow Add layer: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Add
TensorFlow implementation of a residual block: https://github.com/christianversloot/machine-learning-articles/blob/main/how-to-build-a-resnet-from-scratch-with-tensorflow-2-and-keras.md
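For reference, a condensed sketch of the residual-block pattern from the links above, using standard tf.keras (filter counts and shapes are illustrative):

```python
# Minimal residual block sketch using tf.keras.layers.Add (sizes illustrative).
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=64):
    shortcut = x                                                       # identity skip path
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)                   # no activation before the add
    y = layers.Add()([shortcut, y])                                    # skip connection: y = x + F(x)
    return layers.Activation("relu")(y)

inputs = tf.keras.Input(shape=(32, 32, 64))
outputs = residual_block(inputs)
model = tf.keras.Model(inputs, outputs)
```

The key piece is the Add layer merging the identity path with the transformed path; this is the same additive skip behaviour discussed above.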