Closed toshikwa closed 3 years ago
Cheers for the nice comments :).
We are (still) working on getting v1.0 out, i.e. mainly bug testing and reviewing the code. After the release we can discuss adding new algorithms or improvements to existing ones. At a quick glance, this seems simple enough that it could be added without much extra code.
Hello,
Thanks for the suggestion =)
In principle I would be in favor of that addition. We mostly need to discuss its advantages over DQN and variants (QR-DQN, ...) in terms of performance and runtime, and see how much effort it requires and how much complexity it adds.
@Miffyli maybe a good candidate for stable-baselines3 "contrib" (same as #83 )
Thank you for the response.
According to the paper, SAC-Discrete is evaluated with 100k environment steps because they are most interested in sample efficiency, not final performance.
Its results at 100k steps were not bad, but it failed to solve some simple tasks like Pong. DQN (and its extensions) can get much better results, although they need more samples. I would say there is a trade-off. (What do you think?)
Once v1.0 is released, I can contribute to implementing QR-DQN and IQN, in addition to SAC-Discrete.
Thanks :)
The contrib repo is here ;) https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
Make sure to read the contributing guide carefully first ;). In terms of priority, I would prefer QR-DQN and IQN first. For QR-DQN, you can re-use the Huber quantile loss defined in TQC.
(we don't advertise it yet as we want to check the process and not get too many requests for now)
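For readers unfamiliar with the loss mentioned above, here is a minimal pure-Python sketch of a quantile Huber loss for a single transition. It is not the TQC code from the contrib repo; the function names, the midpoint quantile fractions, and the `kappa` default are assumptions made for illustration.

```python
def huber(u, kappa=1.0):
    """Huber loss: quadratic near zero, linear beyond |u| = kappa."""
    a = abs(u)
    return 0.5 * u * u if a <= kappa else kappa * (a - 0.5 * kappa)


def quantile_huber_loss(current_quantiles, target_quantiles, kappa=1.0):
    """Quantile Huber loss for one transition (illustrative sketch).

    current_quantiles: predicted quantile values (list of floats)
    target_quantiles: target quantile values (list of floats)
    """
    n = len(current_quantiles)
    total = 0.0
    for i, theta in enumerate(current_quantiles):
        # Assumed midpoint quantile fraction tau_i = (i + 0.5) / N
        tau = (i + 0.5) / n
        row = 0.0
        for target in target_quantiles:
            u = target - theta  # pairwise TD error
            indicator = 1.0 if u < 0 else 0.0
            # Asymmetric weight: over- and under-estimation are
            # penalised differently depending on tau
            row += abs(tau - indicator) * huber(u, kappa) / kappa
        # Mean over target quantiles, sum over predicted quantiles
        total += row / len(target_quantiles)
    return total
```

The asymmetric weight `abs(tau - indicator)` is what pushes each output toward a different quantile of the return distribution. A batched tensor version would replace the two loops with a broadcasted pairwise difference.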
I was asked to post it here, @PartiallyTyped, regarding the following comment. https://github.com/DLR-RM/stable-baselines3/issues/1#issuecomment-625938738
PartiallyTyped posted an academic paper link for a SAC algorithm that takes a discrete input.
I think PartiallyTyped is already aware of this, since the main GitHub link is mentioned on the paper page, but there is a source code example for it: the author has published his code. https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch/blob/master/agents/actor_critic_agents/SAC_Discrete.py
Hope this helps, Sean
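As a rough illustration of what changes in the discrete-action setting (this is a sketch of my reading of the paper, not the linked implementation): with a finite set of actions, the policy can output a probability per action, so the soft state value can be computed as an exact expectation instead of a sampled estimate. The function names and the two-critic minimum below are assumptions for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def soft_state_value(logits, q1, q2, alpha):
    """V(s) = sum_a pi(a|s) * (min(Q1, Q2)(s, a) - alpha * log pi(a|s)).

    logits: unnormalised policy outputs, one per action
    q1, q2: Q-values from two critics, one per action
    alpha: entropy temperature
    """
    probs = softmax(logits)
    value = 0.0
    for p, a, b in zip(probs, q1, q2):
        if p > 0.0:  # skip zero-probability actions to avoid log(0)
            value += p * (min(a, b) - alpha * math.log(p))
    return value
```

For a uniform policy over two actions with `alpha = 1`, the entropy bonus adds `log 2` to the expected Q-value.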
I would now close this one, as it rather belongs in the contrib repo.
"PartiallyTyped posted an academic paper link for a SAC algorithm that takes a discrete input."
Academic, yes, but not peer-reviewed...
@araffin How about the following paper? https://arxiv.org/abs/1912.11077v1
Hi, thank you for your great work!! I'm interested in contributing to Stable-Baselines3.
I want to implement SAC-Discrete (paper, my implementation). Can we discuss before implementing??