anadiedrichs / drone-deep-rl

Personal repository with experimental code
Apache License 2.0
3 stars 0 forks

Revisit existing approaches #7

Open hidmic opened 6 months ago

hidmic commented 6 months ago

I took a look at the current code. The Webots-driven OpenAI Gym environment seems correct, but I don't follow the rest. What are we expecting the agent to learn from a constant forward motion? Why are we discretizing everything (and so coarsely)? What are we trying to learn here? We want to adapt or correct a user's policy, not learn another one that can replace the user.

I'd strongly suggest we revisit the approaches we spent time studying. Both https://arxiv.org/pdf/1802.01744 and https://arxiv.org/pdf/2004.05097 are quite clear on what they are doing. There is sample code for both too; see https://github.com/rddy/deepassist and https://github.com/cbschaff/rsa.
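
To make "correct the user, don't replace them" concrete, here is a minimal sketch of a residual-style action mix, loosely in the spirit of the second paper as I read it. Everything in it is an assumption for illustration: the observation layout, the `user_policy` and `residual_policy` names, and the hand-written obstacle rule that stands in for what would actually be a learned network. It is not code from either repository.

```python
import math


def clip(x, lo=-1.0, hi=1.0):
    """Clamp a scalar command to the actuator range."""
    return max(lo, min(hi, x))


def user_policy(observation):
    """Hypothetical stand-in for the human pilot: steer straight at the goal."""
    return [clip(g - p) for p, g in zip(observation["position"], observation["goal"])]


def residual_policy(observation, user_action, safety_radius=1.0):
    """Hypothetical stand-in for the learned agent.

    A trained network would go here and could also condition on user_action.
    This hand-written rule only pushes away from an obstacle that is closer
    than safety_radius, and otherwise returns a zero correction.
    """
    position, obstacle = observation["position"], observation["obstacle"]
    distance = math.dist(position, obstacle)
    if distance == 0.0 or distance >= safety_radius:
        return [0.0 for _ in user_action]
    # Correction grows as the drone gets closer to the obstacle.
    scale = (safety_radius - distance) / distance
    return [scale * (p - o) for p, o in zip(position, obstacle)]


def shared_action(observation):
    """User action plus the agent's correction, clipped to the valid range."""
    u = user_policy(observation)
    r = residual_policy(observation, u)
    return [clip(a + b) for a, b in zip(u, r)]


far = {"position": [0.0, 0.0], "goal": [2.0, 0.0], "obstacle": [5.0, 5.0]}
near = {"position": [0.0, 0.0], "goal": [2.0, 0.0], "obstacle": [0.5, 0.0]}
print(shared_action(far))   # user command passes through unchanged
print(shared_action(near))  # forward command is reduced near the obstacle
```

The point of the structure is that the agent's output is a correction on top of whatever the user commands, so with a zero correction the user keeps full control. A constant forward motion with coarsely discretized actions gives the agent nothing like that to correct.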

hidmic commented 6 months ago

FYI @glpuga @olmerg