alexis-jacq / Pytorch-DPPO
Pytorch implementation of Distributed Proximal Policy Optimization: https://arxiv.org/abs/1707.02286
MIT License, 179 stars, 40 forks
Issues (newest first)
#9: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [100, 1]], which is output 0 of TBackward, is at version 3; expected version 2 instead.
opened by TJ2333, 3 years ago, 2 comments
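The error in this issue's title is raised by PyTorch autograd when a tensor saved for the backward pass is modified in place before backward() runs, so its version counter no longer matches. A minimal, repository-independent reproduction (illustrative only, not taken from this project's code):

```python
import torch

a = torch.randn(100, 1, requires_grad=True)
out = torch.tanh(a)   # tanh saves its output tensor for use in the backward pass
out.mul_(2)           # in-place edit bumps the saved tensor's version counter
out.sum().backward()  # RuntimeError: ... modified by an inplace operation
```

In a training loop, the same message commonly appears when parameters are updated in place (for example by optimizer.step()) between a forward pass and the backward() call on a loss computed from that earlier forward pass.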
#8: Question on algorithm itself
opened by QiXuanWang, 5 years ago, 2 comments
#7: average gradients to update global theta?
opened by weicheng113, 5 years ago, 8 comments
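Issue #7 asks whether worker gradients are averaged before the shared ("global") parameters are updated. For reference, a minimal sketch of averaging per-worker gradients onto a shared model; the function name and structure are assumed for illustration and are not taken from this repository:

```python
import torch
import torch.nn as nn

def apply_averaged_grads(global_model: nn.Module, worker_models, optimizer):
    """Average the gradients accumulated in each worker copy into the shared
    model's parameters, then take one optimizer step on the shared model."""
    for g_param, *w_params in zip(global_model.parameters(),
                                  *[m.parameters() for m in worker_models]):
        grads = [p.grad for p in w_params if p.grad is not None]
        if grads:
            g_param.grad = torch.stack(grads).mean(dim=0)
    optimizer.step()       # optimizer must be built over global_model.parameters()
    optimizer.zero_grad()
```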
#6: Failed in more complex environment
closed by kkjh0723, 6 years ago, 1 comment
#5: on advantages
opened by cn3c3p, 6 years ago, 1 comment
#4: Loss questions
closed by wassname, 6 years ago, 3 comments
#3: clamp ratio
opened by cswhjiang, 6 years ago, 1 comment
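Issue #3 concerns the clamping of the probability ratio in PPO's clipped surrogate objective, L^CLIP = E[min(r_t * A_t, clip(r_t, 1 - eps, 1 + eps) * A_t)]. A minimal sketch of that computation, with the clipping threshold eps = 0.2 assumed rather than taken from this repository:

```python
import torch

def clipped_surrogate_loss(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
    """PPO clipped surrogate: -E[min(r * A, clamp(r, 1 - eps, 1 + eps) * A)]."""
    ratio = torch.exp(log_prob_new - log_prob_old)               # r = pi_new / pi_old
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return -torch.min(unclipped, clipped).mean()                 # negate to minimize
```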
#2: Old policy?
closed by Kaixhin, 7 years ago, 5 comments
#1: Create LICENSE
closed by alexis-jacq, 7 years ago, 0 comments