cviaai / RL-DBS

Reinforcement learning for deep brain stimulation (DBS) modeling
MIT License

Project Page Stable Baselines #1

Open araffin opened 4 years ago

araffin commented 4 years ago

Hello,

Nice project =)

I created a colab notebook to try it online directly: https://colab.research.google.com/drive/19bdAiKZY0r5OR3gEv7164CjDOdMRGYqt

Btw, why didn't you use deterministic=True for the prediction? (this would suppress the exploration noise)
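To illustrate the question, here is a minimal sketch of what the flag changes. It assumes a Gaussian policy: with `deterministic=True` the policy returns the mean action, otherwise it samples around that mean. `ToyGaussianPolicy`, its `gain` and `std` values, and the scalar observation are hypothetical stand-ins, not the RL-DBS model or the Stable Baselines implementation.

```python
import random


class ToyGaussianPolicy:
    """Hypothetical stand-in for a trained policy (not the RL-DBS model)."""

    def __init__(self, gain=-0.5, std=0.2, seed=0):
        self.gain = gain              # assumed linear mapping obs -> mean action
        self.std = std                # assumed exploration noise scale
        self.rng = random.Random(seed)

    def predict(self, obs, deterministic=False):
        mean = self.gain * obs
        if deterministic:
            # No exploration noise: always return the mean action.
            return mean
        # Stochastic prediction: sample around the mean action.
        return self.rng.gauss(mean, self.std)


policy = ToyGaussianPolicy()
obs = 1.0

stochastic = [policy.predict(obs) for _ in range(5)]
greedy = [policy.predict(obs, deterministic=True) for _ in range(5)]

print("stochastic:   ", stochastic)   # values scatter around the mean
print("deterministic:", greedy)       # the same mean action every time
```

With a Stable Baselines model the corresponding call would be `action, _states = model.predict(obs, deterministic=True)`.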

Quick question: did you try other algorithms that are usually more suited for continuous actions? (like soft actor-critic (SAC), DDPG and TD3 which should be more sample efficient too)

We would also be interested if you could do a pull request on stable-baselines where you add your project to the documentation (project section) ;)

PS: I tried SAC (with the parameters from the original paper) on your environment and got (apparently) good results in 2e5 steps. The plot:

result

And the ratio of stds:

>print(np.sqrt(s))
1509
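
For readers following along, a hedged sketch of how such a number could be computed. The thread does not define `s`; the assumption here is that it is the ratio of the signal's variance without stimulation to its variance with stimulation, so that its square root is the ratio of standard deviations. The two signals below are synthetic and purely illustrative, not output of the RL-DBS environment.

```python
import math
import random
import statistics

rng = random.Random(0)

# Hypothetical signals: a large oscillation (no stimulation) and a damped
# one (with stimulation), both with a little measurement noise.
baseline = [math.sin(0.1 * t) + rng.gauss(0, 0.01) for t in range(1000)]
suppressed = [0.1 * math.sin(0.1 * t) + rng.gauss(0, 0.01) for t in range(1000)]

# Assumed definition: s is the ratio of variances (before / after).
s = statistics.pvariance(baseline) / statistics.pvariance(suppressed)

# The square root of the variance ratio is the ratio of standard deviations.
std_ratio = math.sqrt(s)
direct = statistics.pstdev(baseline) / statistics.pstdev(suppressed)

print(std_ratio)
```

Under this assumption, a larger value means stronger suppression of the oscillation.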

araffin commented 4 years ago

PS: here is the result using the trained model and deterministic actions: ppo_deterministic

>print(np.sqrt(s))
9.9

cviaai commented 4 years ago

Thank you, Antonin!

The first one really looks overfit, but the second one is very cool and converges to the right equilibrium!

Regarding continuous actions, we did try several other algorithms. However, because we want to simulate real DBS devices (which send pulsatile stimuli into the brain), the current configuration is actually more relevant to real life.

The same applies to the exploration noise. There is a fundamental limit, called finite-size fluctuation, beyond which it is impossible to suppress the network of neurons, so having that noise is actually useful. Whether it is better to first find the best algorithm and then test its stability to noise, or vice versa, is still an open question (see the speculations in "krylov-DBS-RL-paper"), because the environment has a strongly nonlinear response. We will look into your suggestions!

Also, we will get back to you on Monday with pull requests, etc.