uchicago-computation-workshop / nicolas_masse

Repository for Nicolas Masse's presentation at the CSS Workshop (1/13/2019)

mathematically analyzing why this architecture works #3

Open ShuyanHuang opened 5 years ago

ShuyanHuang commented 5 years ago

Thank you for presenting such inspiring research! While your architecture is inspired by neuroscientific studies of the brain, I wonder whether its performance can also be analyzed through the lens of online learning theory. If we treat the training and testing of each permutation as one round of online learning, and frame the goal of learning all the different tasks as minimizing the average loss over the whole sequence, then a traditional ANN resembles a simple Follow the Leader rule, while the context signal acts as a regularization term, making your architecture behave like Follow the Regularized Leader. The known stability of Follow the Regularized Leader may be why your architecture performs better.
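To make the analogy concrete, here is a minimal toy sketch of the two rules on a scalar problem with squared losses and a quadratic regularizer. This is purely illustrative (the losses, the regularizer, and the `lam` parameter are my own assumptions, not anything from the paper's architecture):

```python
import numpy as np

# Hypothetical toy setup: at round t the learner picks w_t, then observes
# the squared loss f_t(w) = (w - z_t)^2 for some target z_t.

def ftl(z_seen):
    """Follow the Leader: minimize the cumulative loss observed so far.
    argmin_w sum_s (w - z_s)^2 is just the sample mean of the targets."""
    return float(np.mean(z_seen))

def ftrl(z_seen, lam):
    """Follow the Regularized Leader with fixed regularizer (lam/2) * w^2.
    argmin_w sum_s (w - z_s)^2 + (lam/2) * w^2 = sum(z) / (t + lam/2),
    i.e. the sample mean shrunk toward zero."""
    return float(np.sum(z_seen)) / (len(z_seen) + lam / 2)

# Targets alternating between two "tasks" with different optima.
z = [1.0, -1.0] * 10

# FTRL's iterates move less from round to round than FTL's, which is
# the stability property the analogy above appeals to.
ftl_path = [ftl(z[:t]) for t in range(1, len(z) + 1)]
ftrl_path = [ftrl(z[:t], lam=10.0) for t in range(1, len(z) + 1)]
```

With `lam = 0`, FTRL reduces exactly to FTL; a fixed regularizer damps how far each new round can pull the solution, which is loosely how I imagine the context signal stabilizing learning across tasks.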

nmasse commented 5 years ago

I'm embarrassingly unfamiliar with online learning theory and the Follow the Leader rule! I'd be happy to chat about it tomorrow.