Our current baseline RL algorithm is DQN (more accurately, DDQN). This algorithm uses epsilon-greedy policies so that it at least has a chance of fully exploring the environment in question. Using epsi…
Just wondering if there will be an upcoming SAC-Discrete implementation?
Dear XuanCe Development Team,
Thank you for your contributions to the field of multi-agent reinforcement learning! I noticed that the MASAC algorithm in the XuanCe project references the …
Dear Petros,
Thank you very much for the implementations; they are very useful. I was able to successfully execute the code in the file Cart_Pole.py. I am now trying to run the Space_Invaders.p…
[Soft Actor-Critic for Discrete Action Settings](https://arxiv.org/abs/1910.07207v1)
Hello @AliiRezaei,
Nice work. Thank you so much for sharing it. I am particularly interested because your work is in C++. I am just wondering, if we change the algorithm from discrete actions to…
Hello, I need to make SacAgent work with discrete actions, so I am trying to implement the Gumbel-Softmax reparameterization trick by redefining the relevant classes. However, the calculation of `agent.train(experie…
I applied the discrete SAC code to a custom discrete-action environment. During training, I found that the critic loss increased rather than decreased, and the critic-loss value after…
The docs mention that an alternate version of SAC, with a slight change, can be used for discrete action spaces. Could you please elaborate with some more details?
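For context, here is a rough sketch of what such a change usually looks like. It assumes the formulation in the "Soft Actor-Critic for Discrete Action Settings" paper linked above, not necessarily what the docs in question describe. The policy outputs a categorical distribution over actions, so the expectation over actions is computed exactly as a weighted sum instead of being estimated from a reparameterized sample. The function names and the NumPy-only setup are illustrative assumptions and are not taken from any particular library.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last (action) axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def discrete_sac_terms(logits, q1, q2, alpha):
    """Hypothetical helper: soft state value and policy loss for discrete SAC.

    logits, q1, q2 have shape (batch, n_actions); alpha is the entropy temperature.
    """
    log_pi = log_softmax(logits)      # log pi(a|s) for every action
    pi = np.exp(log_pi)               # pi(a|s)
    q_min = np.minimum(q1, q2)        # clipped double-Q estimate

    # Exact expectation over actions:
    # V(s) = sum_a pi(a|s) * (Q(s,a) - alpha * log pi(a|s))
    soft_value = (pi * (q_min - alpha * log_pi)).sum(axis=-1)

    # Policy objective, averaged over the batch:
    # J_pi = E_s[ sum_a pi(a|s) * (alpha * log pi(a|s) - Q(s,a)) ]
    policy_loss = (pi * (alpha * log_pi - q_min)).sum(axis=-1).mean()
    return soft_value, policy_loss
```

In an actual agent, `soft_value` evaluated at the next state would feed the critic target `r + gamma * (1 - done) * V(s')`, and `policy_loss` would be minimized with an autodiff framework rather than plain NumPy.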
I find that the network card indicated is not discreet and takes up too much room in my bag when I am out scouting.
I am looking for a viable solution. What would you recommend?