nikhilbarhate99 / PPO-PyTorch

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
MIT License

PPO instead of PPO-M #20

Closed · murtazabasu closed this issue 4 years ago

murtazabasu commented 4 years ago

Hi, I am following your code for my implementation. After seeing the results (which are good, by the way), I want to give full PPO a shot. I know there are already implementations in other repos, but I find this one pretty easy to follow. What kind of modifications would I need to make in your code for the PPO implementation? And is it recommended to modify this code, or to follow another repo? There are some good repos out there, but they are made specifically for Atari and MuJoCo environments, rely on OpenAI Baselines, and are difficult to modify for my environment; I also need Python 3.5+ to work with Baselines. Since I am working with ROS Melodic, which ships with Python 2.7 by default, I can't really use Baselines. Any suggestions would be appreciated.
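(For context: if "PPO-M" refers to this repo's Monte Carlo estimate of returns, the usual step toward the standard PPO formulation is to estimate advantages with Generalized Advantage Estimation (GAE) from per-step value estimates instead of full-episode returns. The sketch below is illustrative only; the function and variable names are not from this repo.)

```python
import torch

def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """GAE over one rollout.

    rewards, values, dones: per-step lists (or 1-D tensors) collected during the rollout.
    last_value: critic estimate for the state after the final step (0.0 if it was terminal).
    """
    T = len(rewards)
    advantages = torch.zeros(T)
    returns = torch.zeros(T)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(T)):
        mask = 1.0 - float(dones[t])            # no bootstrapping across episode ends
        delta = rewards[t] + gamma * next_value * mask - values[t]
        gae = delta + gamma * lam * mask * gae  # exponentially weighted sum of TD errors
        advantages[t] = gae
        returns[t] = gae + values[t]            # regression target for the critic
        next_value = values[t]
    return advantages, returns
```

The (typically normalized) advantages would then replace the discounted Monte Carlo returns in the clipped surrogate objective, and `returns` would serve as the critic's target.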

nikhilbarhate99 commented 4 years ago

Hey, I would not recommend using this repo for complicated environments. It is a very simplified version meant for understanding / learning PPO. I would suggest you write a Gym API for your env and use the other repos as they are, instead of trying to modify them.
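(For reference, wrapping a custom environment behind the classic Gym API only requires `reset`, `step`, and the observation/action spaces. The class below is a hypothetical skeleton, not part of this repo; the space shapes and the simulator calls would be specific to your robot.)

```python
import gym
import numpy as np
from gym import spaces

class MyRobotEnv(gym.Env):
    """Hypothetical Gym wrapper around a custom (e.g. ROS-based) simulator."""

    def __init__(self):
        # Example spaces; adjust dimensions and bounds to your robot.
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(10,), dtype=np.float32)
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

    def reset(self):
        # Reset the simulator / robot here and return the initial observation.
        return np.zeros(self.observation_space.shape, dtype=np.float32)

    def step(self, action):
        # Apply the action, advance the simulation, compute reward and termination.
        obs = np.zeros(self.observation_space.shape, dtype=np.float32)
        reward, done, info = 0.0, False, {}
        return obs, reward, done, info
```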

murtazabasu commented 4 years ago

That would be an option, but as I mentioned before, I am working with ROS, which comes with Python 2.7, so writing a Gym API wouldn't really help: I would still need OpenAI Baselines for VecNormalizing the env, which is only possible with Python 3.5+.
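(If observation normalization is the only thing Baselines is needed for, a small dependency-free running mean/std wrapper can stand in for that part of `VecNormalize` and runs under Python 2.7 as well. The class below is a sketch under that assumption, not code from Baselines or this repo.)

```python
import numpy as np

class RunningObsNorm(object):
    """Per-dimension running mean/std normalization of observations,
    a lightweight stand-in for the observation part of VecNormalize."""

    def __init__(self, shape, clip=10.0, eps=1e-8):
        self.n = 0
        self.mean = np.zeros(shape, dtype=np.float64)
        self.mean_sq = np.zeros(shape, dtype=np.float64)
        self.clip = clip
        self.eps = eps

    def update(self, obs):
        # Incremental update of the running mean and mean of squares.
        self.n += 1
        self.mean += (obs - self.mean) / self.n
        self.mean_sq += (obs ** 2 - self.mean_sq) / self.n

    def __call__(self, obs):
        self.update(obs)
        var = np.maximum(self.mean_sq - self.mean ** 2, 0.0)
        return np.clip((obs - self.mean) / (np.sqrt(var) + self.eps), -self.clip, self.clip)
```

Each raw observation from the environment would then be passed through the wrapper before being fed to the policy, e.g. `obs = normalizer(raw_obs)`.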