-
Our current baseline RL algorithm is DQN (more accurately, DDQN). This algorithm uses epsilon-greedy policies so that it at least has a chance of fully exploring the environment in question. Using epsi…
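For reference, a minimal sketch of the epsilon-greedy action selection described above (plain NumPy; the function name and arguments are illustrative placeholders, not identifiers from the repository):

```python
import numpy as np

def epsilon_greedy_action(q_values, epsilon):
    """Pick a uniformly random action with probability epsilon, else the greedy one.

    q_values: 1-D array of Q estimates, one per discrete action.
    epsilon:  exploration rate in [0, 1].
    """
    if np.random.rand() < epsilon:
        return np.random.randint(len(q_values))  # explore: uniform random action
    return int(np.argmax(q_values))              # exploit: highest-value action
```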
-
Just wondering if there will be an upcoming SAC-Discrete implementation?
Thanks,
Christian
-
Dear Petros,
Thank you very much for the implementations; they are very useful. I was able to successfully run the code in the file Cart_Pole.py. I am now trying to run the Space_Invaders.p…
-
[Soft Actor-Critic for Discrete Action Settings](https://arxiv.org/abs/1910.07207v1)
-
Hello @AliiRezaei,
Nice work. Thank you so much for sharing it. I am really interested, particularly because of your work in C++. I am just wondering: if we change the algorithm from discrete actions to…
-
Hello, I need to make SacAgent work with discrete actions, so I am trying to implement the Gumbel-Softmax reparameterization trick by re-defining the relevant classes. However, the calculation of `agent.train(experie…
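In case it helps, here is a minimal sketch of the Gumbel-Softmax relaxation in TensorFlow (the function name and arguments are illustrative only, not part of the TF-Agents API):

```python
import tensorflow as tf

def gumbel_softmax_sample(logits, temperature=1.0):
    """Draw a differentiable, relaxed one-hot sample from a categorical distribution.

    logits: [batch, num_actions] unnormalised log-probabilities from the policy.
    """
    uniform = tf.random.uniform(tf.shape(logits), minval=1e-8, maxval=1.0)
    gumbel = -tf.math.log(-tf.math.log(uniform))           # Gumbel(0, 1) noise
    return tf.nn.softmax((logits + gumbel) / temperature)  # relaxed one-hot sample
```

Lower temperatures push the sample closer to a hard one-hot vector at the cost of noisier gradients.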
-
I applied the discrete SAC code to a custom discrete-action environment. During training, I found that the critic loss did not decrease but increased, and the critic-loss value after…
-
In the docs, it is mentioned that an alternate version of SAC with a slight change can be used for discrete action spaces. Could you please elaborate with some more details?
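For what it's worth, in the SAC-Discrete paper linked above the change is that the policy outputs a categorical distribution over actions and expectations over actions are computed exactly rather than via the reparameterization trick. A minimal sketch of the corresponding policy loss (TensorFlow; all names here are illustrative, not from any particular library):

```python
import tensorflow as tf

def discrete_sac_policy_loss(q_values, log_probs, alpha):
    """Policy loss for SAC with discrete actions.

    q_values:  [batch, num_actions] critic estimates Q(s, a) for every action.
    log_probs: [batch, num_actions] log pi(a | s) from the categorical policy.
    alpha:     entropy temperature.
    """
    probs = tf.exp(log_probs)
    # The expectation over actions is taken exactly using the policy
    # probabilities, so no sampling or reparameterization is needed.
    per_state = tf.reduce_sum(probs * (alpha * log_probs - q_values), axis=-1)
    return tf.reduce_mean(per_state)
```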
-
I find that the network card mentioned is not discreet and takes up too much space in my bag during my scouting trips.
I am looking for a viable solution; what would you advise?
-
I am implementing a Soft Actor-Critic (SAC) agent and need to evaluate the Q-value network inside my custom environment (for the implementation of a special algorithm, called Wolpertinger's algorithm, to ha…
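For anyone looking at the same thing, here is a minimal sketch of the Wolpertinger-style refinement step, i.e. evaluating the Q network on the k discrete actions nearest to the actor's proto-action (NumPy; every name below is a placeholder, not from an existing library):

```python
import numpy as np

def wolpertinger_action(proto_action, action_embeddings, q_fn, state, k=10):
    """Refine a continuous proto-action into a discrete action.

    proto_action:      continuous action proposed by the actor.
    action_embeddings: [num_actions, dim] embedding of every discrete action.
    q_fn:              callable (state, action_embedding) -> scalar Q estimate.
    k:                 number of nearest neighbours to score with the critic.
    """
    # Find the k discrete actions closest to the proto-action.
    dists = np.linalg.norm(action_embeddings - proto_action, axis=1)
    candidates = np.argsort(dists)[:k]
    # Evaluate each candidate with the Q network and keep the best one.
    q_values = [q_fn(state, action_embeddings[i]) for i in candidates]
    return int(candidates[int(np.argmax(q_values))])
```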