SlimShadys / PPO-StableBaselines3

This repository contains a re-implementation of the Proximal Policy Optimization (PPO) algorithm, originally sourced from Stable-Baselines3.

Algorithm detail implementation #1

Open lqhdehub opened 4 weeks ago

lqhdehub commented 4 weeks ago

Hello, I want to ask: is this code written based on the PPO code in stable_baselines3? Are all the algorithmic details included? Thank you for your contributions.

SlimShadys commented 3 weeks ago

Hey! Yes, this code is based on the PPO implementation from Stable Baselines 3. It's a simplified version: all the library-specific code has been removed, so it just shows how PPO truly works, without the abstract classes and the layers of indirection that SB3 contains.
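For reference, the heart of PPO is the clipped surrogate objective. A minimal PyTorch sketch (variable names are illustrative, not taken from this repo):

```python
import torch

def ppo_clip_loss(ratio: torch.Tensor, adv: torch.Tensor, clip_eps: float = 0.2) -> torch.Tensor:
    """Clipped surrogate policy loss from the PPO paper.

    ratio: pi_theta(a|s) / pi_theta_old(a|s), shape (batch,)
    adv:   advantage estimates (e.g. from GAE), shape (batch,)
    """
    unclipped = ratio * adv
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # Elementwise minimum, then negate: minimizing this loss maximizes
    # the pessimistic (clipped) surrogate objective.
    return -torch.min(unclipped, clipped).mean()
```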

It's just missing support for frame-based environments such as CarRacing (it's on the ToDo list, in fact), where the agent processes the game one (or multiple) frames at a time.
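A typical way to add that kind of support is grayscale conversion plus frame stacking, so the policy can infer motion from consecutive frames. A sketch using Gymnasium's built-in wrappers (names assume Gymnasium < 1.0; newer releases renamed them to `GrayscaleObservation` and `FrameStackObservation`):

```python
import gymnasium as gym
from gymnasium.wrappers import GrayScaleObservation, FrameStack

# CarRacing emits 96x96 RGB frames; convert to grayscale and stack
# the last 4 frames so the agent can estimate velocity.
env = gym.make("CarRacing-v2")
env = GrayScaleObservation(env, keep_dim=False)  # obs shape: (96, 96)
env = FrameStack(env, num_stack=4)               # obs shape: (4, 96, 96)

obs, info = env.reset()
print(obs.shape)  # (4, 96, 96), returned as LazyFrames
```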

Thank you.

lqhdehub commented 3 weeks ago

Thank you for your reply. So what you mean is that this code contains essentially all the technical details of PPO from SB3, while allowing better control over the code (no abstract classes, no chains of indirection).

However, I see that your code seems to support only discrete actions, not continuous ones. In a continuous action space where the ranges of the action dimensions are inconsistent (split into two different scales), can single-agent PPO still control the object well, or is there a way to improve the control performance? Thank you very much. Have a nice day.
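One standard remedy when action dimensions have mismatched ranges (not specific to this repo) is to let the policy act in a normalized [-1, 1] box and rescale each dimension linearly to the environment's true bounds; Gymnasium also ships a similar `RescaleAction` wrapper. A minimal hand-rolled sketch:

```python
import numpy as np
import gymnasium as gym

class RescaleToBounds(gym.ActionWrapper):
    """Map policy actions in [-1, 1] to the env's per-dimension bounds."""

    def __init__(self, env: gym.Env):
        super().__init__(env)
        self.low = env.action_space.low
        self.high = env.action_space.high
        # The agent now sees a symmetric, unit-scale action space.
        self.action_space = gym.spaces.Box(
            low=-1.0, high=1.0, shape=env.action_space.shape, dtype=np.float32
        )

    def action(self, act: np.ndarray) -> np.ndarray:
        # Linear map from [-1, 1] to [low, high], applied per dimension,
        # so dimensions with very different ranges are handled uniformly.
        return self.low + 0.5 * (act + 1.0) * (self.high - self.low)
```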