Open · atonkamanda opened this issue 3 years ago
Thanks for your feedback @atonkamanda, that's a very interesting remark.
I think the timeline looks like this: the function max(0, x), now known as ReLU, appeared early on, but apparently people kept using step functions. Perhaps the benefits of ReLU were not apparent yet, since there was no effective way of training neural nets back then. And perhaps the fact that it does not seem biologically plausible (due to its lack of saturation) may have played a role. It's hard to say. Does this sound reasonable?
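To make the saturation point concrete, here is a minimal NumPy sketch (my own illustration, not code from the book or this issue) contrasting the step function, the sigmoid, and max(0, x): the sigmoid's output is squashed into (0, 1) for large |x|, while ReLU is unbounded above.

```python
import numpy as np

def step(x):
    """Heaviside step function, used in early artificial neurons."""
    return np.where(x >= 0, 1.0, 0.0)

def sigmoid(x):
    """Logistic sigmoid: saturates toward 0 and 1 as |x| grows."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """Rectifier max(0, x): bounded below but not above, so no upper saturation."""
    return np.maximum(0.0, x)

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0, 50.0])
print(step(x))     # [0. 0. 1. 1. 1. 1.]
print(sigmoid(x))  # extremes squashed toward 0.0 and 1.0
print(relu(x))     # [ 0.  0.  0.  1.  5. 50.] -- output keeps growing with x
```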
In the French version, "Deep Learning avec Keras et TensorFlow - 2e éd. - Mise en oeuvre et cas concrets" (page 61, footnote 37 at the bottom of the page), you say that biological neurons seem to implement a sigmoid-shaped activation function, which led researchers to persist in using sigmoids, and that this is therefore an example of a case where the analogy with nature may have been misleading.
However, this statement seems inaccurate to me: according to the Wikipedia page https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#cite_note-Hahnloser2000-1 and the abstract of the original publication https://www.nature.com/articles/35016072, ReLU also seems to have "strong biological motivations".
I don't have enough background in neuroscience to know whether this is a point on which there is no consensus yet, but I wanted to raise the question anyway.