nikhilbarhate99 / PPO-PyTorch
Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
MIT License · 1.63k stars · 340 forks
Issues (newest first)
#71 · Addition of some files and editing for the moving obstacle case · sidwat · opened 1 month ago · 0 comments
#70 · Environment setting about Python version, gym and roboschool · vickychen928 · opened 3 months ago · 0 comments
#69 · If I run test.py, there are these problems, how to solve them? · CodeKnocker · closed 3 months ago · 0 comments
#68 · Why does the new version not recompute the advantages during the K-epoch update? · 31CFDC30 · opened 3 months ago · 0 comments
#67 · (Solved) No env.reset() at the end of each training epoch · slDeng1003 · opened 5 months ago · 2 comments
#66 · The version problem about gym and roboschool · ShunZuo-AI · opened 7 months ago · 2 comments
#65 · policy_old appears to serve no purpose · haduoken · opened 7 months ago · 6 comments
#64 · ValueError: expected sequence of length 8 at dim 1 (got 0) · kavinwkp · opened 10 months ago · 1 comment
#63 · Question · yinshuangshuang621671 · opened 1 year ago · 0 comments
#62 · Optimize the existing Chinese generation model · ARES3366 · opened 1 year ago · 0 comments
#61 · Minor change · mychoi97 · closed 1 year ago · 0 comments
#60 · Continuous action space should use Independent Normal instead of MultivariateNormal · imathg · opened 1 year ago · 1 comment
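Issue #60 argues that for independent action dimensions, `Independent(Normal(...))` is an equivalent but cheaper alternative to `MultivariateNormal` with a diagonal covariance matrix. A minimal sketch of the equivalence (illustrative only, not the repository's code; the 4-dimensional action and 0.5 standard deviation are arbitrary assumptions):

```python
import torch
from torch.distributions import Independent, MultivariateNormal, Normal

mean = torch.zeros(4)          # action mean for a hypothetical 4-dim action
std = torch.full((4,), 0.5)    # per-dimension standard deviation

# The same diagonal-covariance Gaussian, built two ways:
mvn = MultivariateNormal(mean, covariance_matrix=torch.diag(std ** 2))
ind = Independent(Normal(mean, std), 1)  # reinterpret last batch dim as event

action = ind.sample()
# Both assign the same joint log-probability to the same action.
assert torch.allclose(mvn.log_prob(action), ind.log_prob(action), atol=1e-5)
```

The `Independent` version avoids constructing and factorizing a covariance matrix, which matters when the action dimension is large.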
#59 · Test results are not good · 295885025 · opened 1 year ago · 0 comments
#58 · Would a shared network work? · Miguel-s-Amaral · opened 1 year ago · 0 comments
#57 · Setting model to eval() mode in test.py · rllyryan · opened 1 year ago · 0 comments
#56 · Error · deperado007 · opened 2 years ago · 5 comments
#55 · Update from roboschool to pybulletgym · rahatsantosh · closed 2 years ago · 1 comment
#54 · roboschool is deprecated · rahatsantosh · closed 2 years ago · 1 comment
#53 · lilnoon2040@icloud.com · Lilnoon2040 · opened 2 years ago · 2 comments
#52 · Confusion about the loss function · tlt18 · closed 2 years ago · 1 comment
#51 · Convolutional? · Bobingstern · closed 2 years ago · 1 comment
#50 · About environment configuration · BIT-KaiYu · closed 2 years ago · 2 comments
#49 · How can I use this code for a problem with 3 different actions? · m031n · closed 3 years ago · 1 comment
#48 · How to improve the performance based on your code? · 4thfever · closed 3 years ago · 1 comment
#47 · How are you ensuring that actions are in the range (-1, 1) after sampling in continuous action spaces? · PhanindraParashar · closed 3 years ago · 1 comment
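Issue #47 asks how actions sampled from an unbounded Gaussian end up inside (-1, 1). Two common remedies are sketched below (illustrative only; the repository may handle this differently, e.g. by relying on the environment to clip):

```python
import torch
from torch.distributions import Normal

# Hypothetical 2-dim continuous action sampled from an unbounded Gaussian.
dist = Normal(torch.zeros(2), torch.ones(2))
raw_action = dist.sample()

# Option 1: tanh squashing -- smooth and bounded, but the log-probability
# must then be corrected for the change of variables.
squashed = torch.tanh(raw_action)

# Option 2: clip before sending to the environment -- simple and common,
# at the cost of a slight bias near the bounds.
clipped = raw_action.clamp(-1.0, 1.0)

assert squashed.abs().max() <= 1.0
assert clipped.abs().max() <= 1.0
```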
#46 · policy.eval() after load_state_dict() · xinqin23 · closed 3 years ago · 1 comment
#45 · The reward function for training? · DongXingshuai · closed 3 years ago · 1 comment
#44 · PPO with determinate variance · keinccgithub · closed 3 years ago · 3 comments
#43 · Why detach the state values when computing the advantage functions? · jingxixu · closed 3 years ago · 1 comment
#42 · I got an error while running the program · robot-xyh · closed 3 years ago · 2 comments
#41 · Fix for RuntimeError for environments with single continuous actions · Aakarshan-chauhan · closed 3 years ago · 1 comment
#40 · RuntimeError for environments with single continuous action · Aakarshan-chauhan · closed 3 years ago · 0 comments
#39 · Why does PPO use Monte Carlo estimation instead of value function estimation? · outdoteth · closed 3 years ago · 1 comment
#38 · Discounted Reward Calculation (Generalized Advantage Estimation) · artest08 · opened 3 years ago · 5 comments
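Issue #38 concerns the Monte Carlo discounted-return computation (as opposed to GAE). The reverse-accumulation loop that minimal PPO implementations typically use can be sketched as follows (a sketch under my own naming; not necessarily identical to the repository's code):

```python
def discounted_returns(rewards, is_terminals, gamma=0.99):
    """Accumulate discounted returns backwards, resetting at episode ends."""
    returns = []
    running = 0.0
    for reward, done in zip(reversed(rewards), reversed(is_terminals)):
        if done:                     # episode boundary: restart the return
            running = 0.0
        running = reward + gamma * running
        returns.insert(0, running)   # prepend to restore forward order
    return returns

# Example: three steps, last one terminal, gamma = 0.5
# -> [1 + 0.5*(1 + 0.5*1), 1 + 0.5*1, 1] = [1.75, 1.5, 1.0]
print(discounted_returns([1.0, 1.0, 1.0], [False, False, True], gamma=0.5))
```

GAE (the issue's subject) would instead blend TD residuals with a second discount factor lambda, trading the variance of this full Monte Carlo return against bias.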
#37 · Monotonic improvement of PPO · olixu · closed 3 years ago · 2 comments
#36 · Performance of PPO on other projects · pengzhi1998 · closed 3 years ago · 3 comments
#35 · advantages = rewards - state_values.detach() problem · fatalfeel · closed 4 years ago · 2 comments
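Issues #35, #43, and #29 all ask why the advantage is computed as `rewards - state_values.detach()`: detaching blocks the policy-loss gradient from flowing into the critic through the advantage, so the critic is trained only by its own value loss. A minimal sketch of the distinction (assuming PyTorch; the tensor values are arbitrary):

```python
import torch

state_values = torch.tensor([0.5, 0.2], requires_grad=True)  # critic output
returns = torch.tensor([1.0, 0.0])                           # MC returns

# Detached advantage: a constant weight for the policy loss.
advantages = returns - state_values.detach()
assert not advantages.requires_grad   # no gradient path into the critic

log_probs = torch.tensor([-0.1, -0.3], requires_grad=True)
policy_loss = -(log_probs * advantages).mean()
policy_loss.backward()
assert state_values.grad is None      # critic untouched by the actor loss
assert log_probs.grad is not None     # actor still receives gradients
```

Without the `detach()`, the policy loss would also push the critic's parameters, entangling the two objectives.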
#34 · Question on multiple actors · pengzhi1998 · closed 4 years ago · 2 comments
#33 · In CUDA training: expected dtype Double but got dtype Float · fatalfeel · closed 4 years ago · 1 comment
#32 · Unexpected key(s) in state_dict: "affine.weight", "affine.bias" · fatalfeel · closed 4 years ago · 2 comments
#31 · loss.mean().backward() crash · fatalfeel · closed 4 years ago · 1 comment
#30 · Added TensorBoard to track several key metrics · junkwhinger · closed 4 years ago · 0 comments
#29 · Question regarding state_values.detach() · junkwhinger · closed 4 years ago · 3 comments
#28 · I'm a beginner, and I have a question about PPO_continuous.py · GrehXscape · closed 4 years ago · 1 comment
#27 · RuntimeError: Error(s) in loading state_dict for ActorCritic: Unexpected key(s) in state_dict: "affine.weight", "affine.bias" · nro-bot · closed 4 years ago · 1 comment
#26 · Including GAE · CesMak · closed 4 years ago · 1 comment
#25 · Export as ONNX model · CesMak · closed 4 years ago · 1 comment
#24 · Can it be used for chess? · Unimax · closed 4 years ago · 1 comment
#23 · Question about PPO_continuous.py · HeegerGao · closed 4 years ago · 2 comments
#22 · Fix Squeeze Under 1d Action Case · xunzhang · closed 4 years ago · 2 comments