alexis-jacq / Pytorch-DPPO
Pytorch implementation of Distributed Proximal Policy Optimization: https://arxiv.org/abs/1707.02286
MIT License, 179 stars, 40 forks
Issues (newest first)
#9: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [100, 1]], which is output 0 of TBackward, is at version 3; expected version 2 instead.
opened by TJ2333, 3 years ago, 2 comments
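The error in this issue's title is raised by PyTorch autograd when a tensor saved for the backward pass is modified in place before backward() runs, so its version counter no longer matches. A minimal, repository-independent reproduction (illustrative only, not taken from this project's code):

```python
import torch

a = torch.randn(100, 1, requires_grad=True)
out = torch.tanh(a)   # tanh saves its output tensor for use in the backward pass
out.mul_(2)           # in-place edit bumps the saved tensor's version counter
out.sum().backward()  # RuntimeError: ... modified by an inplace operation
```

In a training loop, the same message commonly appears when parameters are updated in place (for example by optimizer.step()) between a forward pass and the backward() call on a loss computed from that earlier forward pass.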
#8: Question on algorithm itself
opened by QiXuanWang, 5 years ago, 2 comments
#7: average gradients to update global theta?
opened by weicheng113, 5 years ago, 8 comments
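Issue #7 asks whether worker gradients are averaged before the shared ("global") parameters are updated. For reference, a minimal sketch of averaging per-worker gradients onto a shared model; the function name and structure are assumed for illustration and are not taken from this repository:

```python
import torch
import torch.nn as nn

def apply_averaged_grads(global_model: nn.Module, worker_models, optimizer):
    """Average the gradients accumulated in each worker copy into the shared
    model's parameters, then take one optimizer step on the shared model."""
    for g_param, *w_params in zip(global_model.parameters(),
                                  *[m.parameters() for m in worker_models]):
        grads = [p.grad for p in w_params if p.grad is not None]
        if grads:
            g_param.grad = torch.stack(grads).mean(dim=0)
    optimizer.step()       # optimizer must be built over global_model.parameters()
    optimizer.zero_grad()
```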
#6: Failed in more complex environment
closed by kkjh0723, 6 years ago, 1 comment
#5: on advantages
opened by cn3c3p, 6 years ago, 1 comment
#4: Loss questions
closed by wassname, 6 years ago, 3 comments
#3: clamp ratio
opened by cswhjiang, 6 years ago, 1 comment
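Issue #3 concerns the clamping of the probability ratio in PPO's clipped surrogate objective, L^CLIP = E[min(r_t * A_t, clip(r_t, 1 - eps, 1 + eps) * A_t)]. A minimal sketch of that computation, with the clipping threshold eps = 0.2 assumed rather than taken from this repository:

```python
import torch

def clipped_surrogate_loss(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
    """PPO clipped surrogate: -E[min(r * A, clamp(r, 1 - eps, 1 + eps) * A)]."""
    ratio = torch.exp(log_prob_new - log_prob_old)               # r = pi_new / pi_old
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return -torch.min(unclipped, clipped).mean()                 # negate to minimize
```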
#2: Old policy?
closed by Kaixhin, 7 years ago, 5 comments
#1: Create LICENSE
closed by alexis-jacq, 7 years ago, 0 comments