aqeelanwar / PEDRA

Programmable Engine for Drone Reinforcement Learning Applications
MIT License

Average Return #81

Open levicorpus0407 opened 3 years ago

levicorpus0407 commented 3 years ago

Hi. I tried to use this open-source project to compare the performance of different algorithms. According to the definition of Return in the code, shouldn't it keep increasing? However, I ran the code (using DQN, REINFORCE, and PPO) multiple times and plotted drone0/Return from the TensorBoard log, and the average return seemed to converge. After I changed the network structure from C3F2 to AlexNetDuel for the DQN algorithm, the return figure seemed to match the paper you provided. But REINFORCE and PPO are still not able to navigate the drone over a long safe distance. Or maybe I missed something. Could you please give me some guidance on this issue? Thanks for your work. It helps a lot.
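For reference, the quantity in question can be sketched as below. This is only an illustration under stated assumptions, not PEDRA's actual code: it assumes the return is the undiscounted sum of per-step rewards within an episode, reset when the episode ends (for example on a crash), and that the plotted curve is a moving average over episodes. The function names `episodic_returns` and `moving_average` are hypothetical.

```python
from collections import deque


def episodic_returns(rewards, dones):
    """Sum per-step rewards into one return per finished episode.

    Assumption: return is undiscounted and resets when done is True.
    """
    returns, running = [], 0.0
    for reward, done in zip(rewards, dones):
        running += reward
        if done:
            returns.append(running)
            running = 0.0
    return returns


def moving_average(values, window):
    """Average each value with up to `window - 1` preceding values."""
    buf = deque(maxlen=window)
    smoothed = []
    for value in values:
        buf.append(value)
        smoothed.append(sum(buf) / len(buf))
    return smoothed


# Toy data: two episodes, each ending with a -1 penalty step.
rewards = [1.0, 1.0, -1.0, 1.0, 1.0, 1.0, -1.0]
dones = [False, False, True, False, False, False, True]

returns = episodic_returns(rewards, dones)
smoothed = moving_average(returns, window=2)
print(returns, smoothed)
```

Under this definition, the average return grows only while episodes get longer or collect more reward per step, so a curve that flattens would indicate that the policy has stopped improving rather than a logging problem.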