Well, it's a good point. Right now in our testing we found it much quicker than SGD for both iris and MNIST, so I'm not sure where the 1,000-times-slower figure comes in, but we would love to include a new example if you want to create a spec/project using it.
We will also gladly link to an outside project if you want to use shainet for a reinforcement learning example :)
@bararchy I was hoping you already had an example. ;-) I haven't done any reinforcement learning yet but always love a challenge. I will see if I can find one that isn't too complicated and will work as an example.
That could be cool. We do have more than a few classes' worth of reinforcement code, but it's closed source and highly integrated into our systems.
Let me know how it works for you; @ArtLinkov and I would love to help.
Reading through the ES strategy here:
Quote: Note on supervised learning. It is also important to note that supervised learning problems (e.g. image classification, speech recognition, or most other tasks in the industry), where one can compute the exact gradient of the loss function with backpropagation, are not directly impacted by these findings. For example, in our preliminary experiments we found that using ES to estimate the gradient on the MNIST digit recognition task can be as much as 1,000 times slower than using backpropagation. It is only in RL settings, where one has to estimate the gradient of the expected reward by sampling, where ES becomes competitive.
It seems that ES can be 1,000 times slower when doing supervised learning. The ES example in the readme is doing exactly that, so I'm wondering if we should provide a better example, or at least document that this is not a recommended use of the strategy?
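For anyone skimming the quote, here is a minimal, self-contained sketch of the sampling it describes. This is plain Crystal with no SHAInet API; `f`, `sigma`, `alpha`, and `pop` are made-up toy values, not anything from the library. The point is that ES never differentiates the loss: it perturbs the parameters, scores each perturbation, and averages the noise weighted by score, so every update costs a whole population of forward evaluations. That per-update population is the overhead backprop avoids whenever an exact gradient is available.

```crystal
# Toy ES gradient estimate on a 1-D quadratic, f(w) = (w - 3)^2.
# Hypothetical standalone example; no SHAInet code involved.

# Box-Muller transform: two uniform samples -> one Gaussian sample.
def gaussian(rng : Random) : Float64
  u1 = rng.next_float
  u2 = rng.next_float
  Math.sqrt(-2.0 * Math.log(u1 + 1e-12)) * Math.cos(2.0 * Math::PI * u2)
end

def f(w : Float64) : Float64
  (w - 3.0) ** 2
end

sigma = 0.1   # perturbation scale
alpha = 0.05  # learning rate
pop   = 50    # evaluations per update -- this is where the extra cost lives
rng   = Random.new
w     = 0.0

300.times do
  noise  = Array.new(pop) { gaussian(rng) }
  scores = noise.map { |n| f(w + sigma * n) }

  # Monte Carlo gradient estimate: E[(f(w + sigma*n) - f(w)) * n] / sigma.
  # Subtracting the baseline f(w) is a standard variance-reduction trick;
  # it doesn't change the expectation, since E[n] = 0.
  base = f(w)
  grad = 0.0
  pop.times { |i| grad += (scores[i] - base) * noise[i] }
  grad /= pop * sigma

  w -= alpha * grad
end

puts w # converges near 3.0, at pop loss evaluations per step
```

Backprop would get the exact gradient `2 * (w - 3)` from a single evaluation, which is the rough intuition behind the 1,000x figure in the post for supervised tasks like MNIST.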