DaDucking / PPOAttention


Performance Comparison Results #1

Open namjiwon1023 opened 1 year ago

namjiwon1023 commented 1 year ago

Thank you for your contribution!

I want to know: among all of the self-attention RL algorithms, which one has the best performance?

Thank you!

DaDucking commented 1 year ago

Hi Namjiwon1023,

Thank you for taking an interest in my previous experiment.

Based on the experiments, I previously concluded that the Channel-wise Self-Attention Network (C-SAN, the rvuattn code) is better on average (about 15%) at training a more efficient model. This could be due to the complexity of the tasks, the number of elements in the environment, or dynamic variables. One caveat is that a non-trivial share of the no-attention models actually performed better in certain scenarios (roughly 20%).

The scope of this experiment is also extremely narrow, especially since it only covers simple Atari 2600 games, so it may not translate well to other domains. I have not tried it elsewhere myself because of limited time and resources, and I have since dropped efforts to understand what makes the C-SAN variant superior in my experiments.

Overall, I would like to think that C-SAN (the rvuattn code) provides a better state representation, which makes it easier for the underlying RL model to learn. If you give it a try, do let me know whether or not it works.
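If it helps, here is a minimal sketch of what channel-wise self-attention over a CNN feature map looks like. The class name, the learnable `gamma` residual weight, and the shapes are illustrative assumptions, not the actual rvuattn implementation:

```python
import torch
import torch.nn as nn

class ChannelSelfAttention(nn.Module):
    """Sketch of channel-wise self-attention over a conv feature map.

    Each channel attends to every other channel, using its flattened
    spatial map as the feature vector. Illustrative only; not the
    repo's rvuattn code.
    """

    def __init__(self):
        super().__init__()
        # Learnable residual weight, initialised to 0 so the block starts
        # as an identity mapping and attention is blended in gradually.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # x: (batch, channels, height, width) feature map from the CNN encoder
        b, c, h, w = x.shape
        flat = x.view(b, c, h * w)                    # (b, c, hw)
        attn = torch.bmm(flat, flat.transpose(1, 2))  # (b, c, c) channel affinities
        attn = torch.softmax(attn, dim=-1)
        out = torch.bmm(attn, flat).view(b, c, h, w)  # re-weighted channels
        return self.gamma * out + x                   # residual connection

# Example: refine the features of an Atari-style CNN encoder before the PPO head.
attn = ChannelSelfAttention()
features = torch.randn(8, 64, 7, 7)
refined = attn(features)  # same shape, channel-mixed representation
```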

Thanks!