-
In theory this should be feasible, but I don't know how to write it with Paddle PARL. Could the official team provide an implementation?
The references I was able to find are:
https://stackoverflow.com/questions/56226133/soft-actor-critic-with-discrete-action-space
https://www.spaces.ac.cn/archives/6705
Thanks.
…
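For anyone skimming this thread: the gist of the two references above is that with a discrete action space the expectation over actions can be computed exactly from the categorical policy, so the reparameterization trick of continuous SAC is not needed. Below is a minimal PyTorch sketch of that actor loss, not a PARL implementation; `logits`, `q1`, and `q2` are assumed `(batch, n_actions)` tensors and `alpha` a scalar temperature:

```python
import torch
import torch.nn.functional as F

def discrete_sac_actor_loss(logits, q1, q2, alpha):
    """Discrete-SAC actor loss: the expectation over actions is computed
    analytically from the categorical distribution, so no reparameterization
    trick is needed (unlike continuous SAC)."""
    probs = F.softmax(logits, dim=-1)
    log_probs = F.log_softmax(logits, dim=-1)
    q = torch.min(q1, q2)  # clipped double-Q to curb overestimation
    # E_{a~pi}[ alpha * log pi(a|s) - Q(s,a) ], summed exactly over actions
    return (probs * (alpha * log_probs - q)).sum(dim=-1).mean()
```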
-
Hi, I was thinking of incorporating the action (in addition to state) as a state-action pair input into the [rainbow dqn](https://nbviewer.jupyter.org/github/Curt-Park/rainbow-is-all-you-need/blob/mas…
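A minimal sketch of what such a state-action-pair critic could look like, assuming a one-hot encoded action concatenated to the state (purely illustrative, not the notebook's code; Rainbow's distributional and dueling heads would need further rework to fit this shape):

```python
import torch
import torch.nn as nn

class StateActionQNet(nn.Module):
    """Q-network consuming a state-action pair by concatenation,
    instead of outputting one Q-value per action as DQN-style nets do."""
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar Q(s, a)
        )

    def forward(self, state, action_onehot):
        return self.net(torch.cat([state, action_onehot], dim=-1))
```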
-
Hello,
I read the Soft Actor-Critic paper, and in the soft policy iteration section I saw a policy improvement step that uses the Kullback-Leibler divergence. Does soft actor critic for discrete actions also u…
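For reference, the policy improvement step in question projects the policy onto the exponentiated soft Q-values; in the temperature-parameterized form it reads:

```latex
\pi_{\text{new}} = \arg\min_{\pi' \in \Pi}
  D_{\mathrm{KL}}\!\left( \pi'(\,\cdot \mid s_t) \;\middle\|\;
  \frac{\exp\!\big(\tfrac{1}{\alpha}\, Q^{\pi_{\text{old}}}(s_t, \cdot)\big)}
       {Z^{\pi_{\text{old}}}(s_t)} \right)
```

For a discrete action space this KL divergence is a finite sum over actions, so it can be evaluated exactly rather than estimated from samples.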
-
This was assigned to me by @wakeuplearn
https://towardsdatascience.com/soft-actor-critic-demystified-b8427df61665
Spinning Up Docs are also good: https://spinningup.openai.com/en/latest/algorithms/…
-
## 🐛 Bug
I encountered this error, which (I think?) is a version-mismatch error:
`RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [tor…
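One common cause of this error in SAC training loops (a guess, since the trace is truncated): since PyTorch 1.5, `optimizer.step()` counts as an in-place modification of the weights, so backpropagating an actor loss through a graph that was built before the critic's `step()` raises exactly this `RuntimeError`. A schematic fix, with `compute_critic_loss` and `compute_actor_loss` as hypothetical helpers:

```python
def sac_update(batch, actor_opt, critic_opt,
               compute_critic_loss, compute_actor_loss):
    """Ordering that avoids the in-place error: finish the critic's
    backward + step before building the actor's graph."""
    critic_opt.zero_grad()
    compute_critic_loss(batch).backward()
    critic_opt.step()  # mutates critic weights in place

    # Build the actor's graph only now, after the critic step, so it
    # never references pre-step versions of the critic weights.
    actor_opt.zero_grad()
    compute_actor_loss(batch).backward()
    actor_opt.step()
```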
-
Hello,
Has anyone tried using images as input to train the network? I have worked on that for a couple of days, using a 3-layer conv net to process images in place of the original low-dimensional states, but…
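In case it helps, a minimal sketch of the kind of 3-layer conv encoder being described, here using the classic Nature-DQN filter sizes on an assumed 84x84 input (all names and sizes are illustrative):

```python
import torch
import torch.nn as nn

class PixelEncoder(nn.Module):
    """3-layer conv encoder replacing the low-dimensional state input;
    its flattened features feed the existing actor/critic MLP heads."""
    def __init__(self, in_channels=3, feature_dim=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        with torch.no_grad():  # infer flattened size from a dummy 84x84 frame
            n_flat = self.conv(torch.zeros(1, in_channels, 84, 84)).shape[1]
        self.fc = nn.Linear(n_flat, feature_dim)

    def forward(self, obs):
        return self.fc(self.conv(obs / 255.0))  # normalize raw pixel input
```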
-
Dear Author,
I took a quick look at your code for the actor update. It seems that you have used an advantage-style soft actor-critic, i.e.,
Advantage: `pol_target = q - v`
Loss: `pol_loss = (log_pi * (log_pi /…
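Reconstructing what the truncated loss line likely expresses (a guess, not the repository's actual code): an advantage-weighted score-function update where the weight is detached so gradients flow only through `log_pi`, e.g.:

```python
def advantage_actor_loss(log_pi, q, v, alpha):
    """Score-function (likelihood-ratio) actor update weighted by
    (alpha * log_pi - advantage), with the weight detached."""
    advantage = q - v  # `pol_target` in the issue's notation
    return (log_pi * (alpha * log_pi - advantage).detach()).mean()
```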
-
# OSR Dimensions | Stellarios
Dimensions for SpaceChallenge.tech
[https://acord-robotics.github.io//stellarios/awsjplosrdrl-dimensions/](https://acord-robotics.github.io//stellarios/awsjplosrdrl-dim…
-
Hi,
To run the SAC (soft actor-critic) algorithm with TensorFlow 2.0 and PyRep, one tricky thing is that we have to create two different environments: one for collection and one fo…
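A rough sketch of that two-environment setup, assuming a gym-style `reset`/`step` API; `make_env` stands in for whatever PyRep scene wrapper is in use:

```python
def evaluate(policy, eval_env, episodes=5):
    """Run evaluation episodes on a dedicated env instance, so the
    collection env's ongoing episode is never reset mid-rollout."""
    returns = []
    for _ in range(episodes):
        obs, done, total = eval_env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = eval_env.step(policy(obs))
            total += reward
        returns.append(total)
    return sum(returns) / len(returns)

# Two independent instances: one for collection, one for evaluation.
# collect_env = make_env(headless=True)
# eval_env    = make_env(headless=True)
```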
-
I am unable to find any place in the code where Q1 and Q2 are used asymmetrically, which makes me think: is Q2 completely redundant?
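The symmetry is likely the point: in the standard twin-critic (clipped double-Q) pattern both critics regress to the same target, and the `min` over their target copies is what curbs overestimation bias, so Q2 is not redundant. A sketch of that pattern, with the networks, target copies, and batch tensors all assumed:

```python
import torch

def twin_q_critic_loss(q1, q2, q1_target, q2_target, actor,
                       obs, act, reward, next_obs, done, gamma, alpha):
    """Clipped double-Q target: both critics regress to the same target;
    the min over target critics suppresses overestimation."""
    with torch.no_grad():
        next_act, next_log_pi = actor(next_obs)  # assumed actor API
        q_next = torch.min(q1_target(next_obs, next_act),
                           q2_target(next_obs, next_act))
        target = reward + gamma * (1.0 - done) * (q_next - alpha * next_log_pi)
    return (((q1(obs, act) - target) ** 2).mean()
            + ((q2(obs, act) - target) ** 2).mean())
```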