grooviiee / python_uav

Challenge to Reinforcement learning.

How to consider attention mechanism? #12

Open grooviiee opened 1 year ago

grooviiee commented 1 year ago

First, understand the MAPPO algorithm.

grooviiee commented 1 year ago

We currently understand how the PPO algorithm searches for a better policy so that the agent performs better.

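More precisely, we need to understand it in depth. First, understand the GAE (Generalized Advantage Estimation) algorithm (written as VAE at first, but GAE is meant). GAE has to compute the advantage, and it applies a weight to try to reduce bias. The weight (usually written λ) blends short and long multi-step estimates, so it trades bias against variance.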
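As a rough illustration of the GAE point above, here is a minimal sketch in plain Python. The function name `compute_gae`, its parameters, and the default values of `gamma` and `lam` are assumptions for this example, not code from this repo. It handles a single trajectory and ignores episode termination.

```python
def compute_gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Sketch of GAE advantages for one trajectory (hypothetical helper).

    rewards[t]  : reward received at step t
    values[t]   : critic's value estimate for the state at step t
    last_value  : critic's value estimate for the state after the final step
    gamma       : discount factor
    lam         : GAE weight; 0 uses only the one-step TD residual,
                  1 sums all discounted residuals to the end of the trajectory
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step can reuse the accumulated sum of later residuals.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]  # TD residual
        running = delta + gamma * lam * running              # weighted sum of residuals
        advantages[t] = running
        next_value = values[t]
    return advantages
```

With `lam=0` each advantage is just the one-step TD residual (more bias, less variance). With `lam=1` it becomes the full discounted return minus the value estimate (less bias, more variance).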

Second, understand probability distributions. Third, understand KL divergence (which measures how much one probability distribution differs from another).
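To make the KL divergence point concrete, here is a small sketch for discrete distributions. The helper name `kl_divergence` is an assumption for this example. It expects both inputs to be probability lists of equal length, with `q` nonzero wherever `p` is nonzero.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions (hypothetical helper).

    Terms where p[i] == 0 contribute nothing, so they are skipped.
    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


old_policy = [0.5, 0.5]  # e.g. action probabilities before an update
new_policy = [0.9, 0.1]  # e.g. action probabilities after an update

same = kl_divergence(old_policy, old_policy)      # zero for identical distributions
forward = kl_divergence(old_policy, new_policy)   # positive when they differ
backward = kl_divergence(new_policy, old_policy)  # generally not equal to `forward`
```

Note that KL divergence is not symmetric, so it is not a true distance. In PPO-style methods it is commonly used to check how far the updated policy has moved from the old one.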

grooviiee commented 1 year ago

Also, most MAPPO implementations use only one actor-critic network that is shared across the environment...

We need to modify the training algorithm so that it works with separate actor-critic networks.
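The difference between the two setups could be sketched like this. Everything here is hypothetical: the `ActorCritic` class is only a stand-in that holds random weights, and the agent ids and dimensions are made up for illustration. A real version would use the repo's own network classes.

```python
import random


class ActorCritic:
    """Hypothetical stand-in holding one set of actor and critic parameters."""

    def __init__(self, obs_dim, n_actions, seed):
        rng = random.Random(seed)
        # Actor: one weight row per action. Critic: one weight per observation feature.
        self.actor_weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(obs_dim)] for _ in range(n_actions)
        ]
        self.critic_weights = [rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]


def build_shared(agent_ids, obs_dim, n_actions):
    """Every agent points at the same network (parameter sharing)."""
    net = ActorCritic(obs_dim, n_actions, seed=0)
    return {agent_id: net for agent_id in agent_ids}


def build_separate(agent_ids, obs_dim, n_actions):
    """Every agent gets its own independently initialised network."""
    return {
        agent_id: ActorCritic(obs_dim, n_actions, seed=i)
        for i, agent_id in enumerate(agent_ids)
    }


agent_ids = ["uav_0", "uav_1"]  # made-up agent names
shared = build_shared(agent_ids, obs_dim=4, n_actions=3)
separate = build_separate(agent_ids, obs_dim=4, n_actions=3)
```

In the separate setup the training loop would need to keep one optimizer and one rollout buffer per agent id, and update each network only from that agent's own data.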