aidudezzz / deepbots

A wrapper framework for Reinforcement Learning in the Webots robot simulator using Python 3.
https://deepbots.readthedocs.io/
GNU General Public License v3.0

Extend deepbots to support Evolutionary Algorithms #86

Open eakirtas opened 3 years ago

eakirtas commented 3 years ago

Initially, deepbots was developed to support Reinforcement Learning algorithms; however, we expect that it can easily be extended to support Evolutionary Algorithms. In an evolutionary algorithm, a population of agents is trained and mutated to solve a given task. At every episode, the best agents are chosen to mutate in order to reach a good enough solution.

This project is quite open. We recommend choosing an easy task such as Cartpole and adapting it to an evolutionary setting. We expect a grid of different agents that try to solve the problem as the episodes pass. We are open to using any evolutionary algorithm, but we highly recommend a well-established one. Finally, we expect to integrate Evolution-Guided Policy Gradient in Reinforcement Learning, as proposed at NIPS 2018.

Any questions about which evolutionary algorithms can be used, general questions, or ideas are more than welcome!

veds12 commented 3 years ago

Hey @ManosMagnus, I was looking into the algorithms that we could try to adapt the environments to and realised that there are quite a few of them. Do you have any preference as to which one I should start looking into first?

eakirtas commented 3 years ago

Hello @veds12, indeed there are lots of genetic/evolutionary algorithms that can be used. As mentioned, this project is quite open, so feel free to share your ideas too so that we can discuss them further.

Let me share some of my ideas:

First of all, I think we can start with a simple genetic algorithm in order to set up the genetic training infrastructure. I find this blog-post quite helpful. This way, we can build a training infrastructure that is based on genetic training (using the traditional mutation/crossover scheme). I strongly recommend using a tensor-based framework (such as PyGAD and PyTorch), which can be useful in the future. Since genetic algorithms follow the same philosophy, we can later easily extend it with other approaches (such as Elitism, Adaptive GAs, CMA-ES). Some resources:

From my perspective, the first step could be the integration of the PyGAD library into a simple example such as cartpole. In this way, we can create a wrapper infrastructure that can be used for genetic training.
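To make the traditional mutation/crossover scheme mentioned above concrete, here is a minimal sketch that uses only the Python standard library. It is not PyGAD and not part of deepbots; the function names, the toy fitness function, and all parameter values are assumptions chosen for illustration. Selection simply keeps the better half of the population unchanged (truncation selection), which is one of several possible choices.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(genome, rng, rate=0.1, scale=0.5):
    """Add Gaussian noise to each gene with probability `rate`."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g
            for g in genome]


def evolve(fitness, genome_len=4, pop_size=20, generations=50, seed=0):
    """Run a basic generational loop and return the fittest genome found."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness (higher is better) and keep the better half.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        # Refill the population with mutated offspring of random parent pairs.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(genome):
    """Hypothetical stand-in for an episode reward: best when every gene is 0.5."""
    return -sum((g - 0.5) ** 2 for g in genome)


best = evolve(toy_fitness)
```

In a real setup, `toy_fitness` would be replaced by whatever score a simulated episode produces for the given genome.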

After that, the wrapper infrastructure can be extended to support Genetic Algorithms with Neural Networks. In this case, we can also use PyGAD, which supports GA with NN. Another option is NEAT, the classical algorithm for evolving neural networks. Uber Research has some great resources for this task:

Depending on your preference, you can develop a plan that is feasible within the GSoC timeline. Of course, the above ideas are just that, ideas. If you come up with another idea or other frameworks that can be used, we can discuss them as well.
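For the "Genetic Algorithm with Neural Network" step above, one common approach is to treat the network weights as a flat genome that the GA mutates and recombines. The sketch below shows that encoding with plain Python lists. It assumes a CartPole-like task with 4 observation values and 2 discrete actions; the layer sizes and function names are hypothetical and do not come from PyGAD or deepbots.

```python
import math
import random

# Assumed sizes for a CartPole-like task: 4 observations, 2 discrete actions.
N_INPUTS, N_HIDDEN, N_ACTIONS = 4, 8, 2
GENOME_LEN = N_INPUTS * N_HIDDEN + N_HIDDEN + N_HIDDEN * N_ACTIONS + N_ACTIONS


def decode(genome):
    """Split a flat genome into (w1, b1, w2, b2) of a one-hidden-layer network."""
    it = iter(genome)
    w1 = [[next(it) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
    b1 = [next(it) for _ in range(N_HIDDEN)]
    w2 = [[next(it) for _ in range(N_HIDDEN)] for _ in range(N_ACTIONS)]
    b2 = [next(it) for _ in range(N_ACTIONS)]
    return w1, b1, w2, b2


def select_action(genome, observation):
    """Forward pass with tanh hidden units; returns the index of the best action."""
    w1, b1, w2, b2 = decode(genome)
    hidden = [math.tanh(sum(w * x for w, x in zip(row, observation)) + b)
              for row, b in zip(w1, b1)]
    scores = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    return scores.index(max(scores))


rng = random.Random(0)
genome = [rng.uniform(-1, 1) for _ in range(GENOME_LEN)]
action = select_action(genome, [0.0, 0.1, -0.05, 0.0])
```

Because the genome is just a list of floats, any mutation or crossover operator that works on lists can be applied to it unchanged.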

veds12 commented 3 years ago

Hey! Sorry for the late reply. I already went through some of this actually. I'll go through the rest of them in a couple of days and get back to you! We'll probably be able to have a better discussion then. Coming to frameworks, I was mostly planning to use PyTorch since I am very comfortable with that. I am not familiar with PyGAD but it seems worthwhile to look into. I'll get back to you regarding this too, once I get an idea of how it works 👍

veds12 commented 3 years ago

Also, I wanted to ask how the proposal phase would work. Would this be the right time to start formalising the ideas, or would it be better to discuss and refine them first?

eakirtas commented 3 years ago

I feel that both ways are equally good. You should not leave it for the last day before the deadline. I would recommend developing a draft that formalizes your idea and then discussing the idea with reference to your proposal. So feel free to send a draft proposal to the GFOSS list so that we can provide you with feedback.

yiorgosynkl commented 3 years ago

Hello @ManosMagnus !! The deepbots project ideas have sparked my interest (especially this one) and I have already been experimenting with OpenAI, Webots and deepbots. I am trying to strengthen my proposal and I was thinking that implementing a small project using deepbots would be a good idea. Do you encourage such an attempt? Do you have any suggestions? Thanks in advance

eakirtas commented 3 years ago

Hello @yiorgosynkl,

Of course we encourage such attempts. We will be happy to help you. I suggest taking a look at issue #27. We would like to extend deepbots to support a different kind of communication between supervisor and robot. That can happen via customData. I would recommend replicating an easy example (such as cartpole) in order to test it. Feel free to ask for help in the issue discussion.

Piyush-555 commented 3 years ago

Hi @ManosMagnus , if I'm not mistaken the goal of this project is to create examples that use Genetic Algorithms (popular algorithms on suitable environments) while keeping the implementation of the algorithm fairly general as done for DDPG and PPO in the deepworlds repository. And based on the situation, we may need to use emitter-receiver and/or robot-supervisor scheme. Is that correct?

eakirtas commented 3 years ago

Exactly!

Let me elaborate on this a bit. Let's assume that we use the simplest possible example, such as cartpole. The first step might be to create a 3x3 grid in the Webots world with 9 different instances of the cartpole problem (we call this a generation). Each instance (let's call it a robot) runs its own controller with its specific configuration. After the first generation, the results (such as the fitness value) of each robot should be sent to a "central processing unit" (let's call this the supervisor). In turn, the supervisor has to do the appropriate processing (such as mutations and combinations). Afterwards, the supervisor should reset the world to its initial state and replace the existing robots in the scene (the new generation), passing them the new configuration that is going to be used in the next generation. This should be repeated until a specific condition is met.
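The flow described above (a 3x3 grid of robots, fitness reported back to the supervisor, processing, reset, repeat) can be sketched in plain Python without Webots. Everything here is a stand-in under stated assumptions: `evaluate` replaces running a robot controller in the simulator, the spacing value is made up, and the world reset is only marked by a comment because the actual reset call depends on how the supervisor is implemented.

```python
import random

GRID_SIZE = 3   # 3x3 grid -> 9 robots per generation
SPACING = 2.0   # hypothetical distance between neighbouring cartpoles


def grid_positions(size=GRID_SIZE, spacing=SPACING):
    """(x, y) placement for each robot so that the instances do not overlap."""
    return [(col * spacing, row * spacing)
            for row in range(size) for col in range(size)]


def run_generation(configs, evaluate):
    """Stand-in for one simulated generation: every robot reports a fitness."""
    return [evaluate(cfg) for cfg in configs]


def next_generation(configs, fitnesses, rng, n_parents=3, scale=0.1):
    """Supervisor-side processing: keep the best, refill with mutated copies."""
    ranked = [cfg for _, cfg in
              sorted(zip(fitnesses, configs), key=lambda p: p[0], reverse=True)]
    parents = ranked[:n_parents]
    children = [[g + rng.gauss(0.0, scale) for g in rng.choice(parents)]
                for _ in range(len(configs) - n_parents)]
    return parents + children


def train(evaluate, genome_len=4, max_generations=30, target=-0.01, seed=0):
    """Repeat generations until the target fitness or the generation limit."""
    rng = random.Random(seed)
    configs = [[rng.uniform(-1, 1) for _ in range(genome_len)]
               for _ in grid_positions()]
    best = float("-inf")
    for generation in range(max_generations):
        fitnesses = run_generation(configs, evaluate)
        best = max(best, max(fitnesses))
        if best >= target:  # the "specific condition" that ends training
            break
        # A real supervisor would reset the world to its initial state here
        # before handing the new configurations to the robots.
        configs = next_generation(configs, fitnesses, rng)
    return best, generation + 1
```

For example, `train(lambda cfg: -sum(g * g for g in cfg))` returns the best fitness seen and the number of generations that were run.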

Currently, deepbots supports only one-to-one (supervisor-robot) communication. For genetic algorithms, it might be helpful to have a one-to-many (supervisor to many robots) communication scheme. That could be feasible either with emitter-receiver or with the customData field. Additionally, we need a way to formalize the scene setup. Finally, we need an abstraction or a generic way for the supervisor controller to perform those functionalities (such as def create_new_generation(old_generation), def mutations(old_generation), etc.).

Of course this is just an idea; it can also be implemented in a one-by-one manner. Both are welcome!

Please let me know if this makes sense to you or if you have any concerns or improvements.
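One possible shape for such an abstraction is sketched below. This is a hypothetical interface, not part of the deepbots API: the class names are invented, `create_new_generation` takes an extra `fitnesses` argument that the signature above does not have, and the one-to-many message is just a JSON string that maps a robot id to its genome, which could then be carried by whichever channel is chosen (emitter-receiver or customData).

```python
import json
from abc import ABC, abstractmethod


class EvolutionarySupervisor(ABC):
    """Hypothetical interface for a GA-aware supervisor (not in deepbots)."""

    @abstractmethod
    def create_new_generation(self, old_generation, fitnesses):
        """Return the genomes of the next generation."""

    @abstractmethod
    def mutations(self, genome):
        """Return a mutated copy of one genome."""

    def pack_broadcast(self, generation):
        """Build one message for all robots: robot id -> genome."""
        return json.dumps({str(i): g for i, g in enumerate(generation)})


def unpack_for_robot(message, robot_id):
    """Robot side: pick this robot's genome out of the broadcast message."""
    return json.loads(message)[str(robot_id)]


class KeepBestSupervisor(EvolutionarySupervisor):
    """Toy subclass: copy the best genome and fill the rest with mutations."""

    def mutations(self, genome):
        # Deterministic placeholder so the example is easy to follow;
        # a real implementation would add random noise instead.
        return [g + 0.1 for g in genome]

    def create_new_generation(self, old_generation, fitnesses):
        best = old_generation[fitnesses.index(max(fitnesses))]
        return [best] + [self.mutations(best) for _ in old_generation[1:]]


supervisor = KeepBestSupervisor()
new_generation = supervisor.create_new_generation(
    [[0.0], [1.0], [2.0]], [0.1, 0.9, 0.3])
message = supervisor.pack_broadcast(new_generation)
genome_for_robot_2 = unpack_for_robot(message, 2)
```

Keeping the message format separate from the transport means the same packing code would work for either communication scheme.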

Piyush-555 commented 3 years ago

@ManosMagnus Sounds good!
I think it is perfectly acceptable to modify deepbots API to support multi-agent communication, thus making the API more general.
But do we need to create abstractions in deepbots API for GA-specific methods (like create_new_generations, mutate)? Won't this be taken care of in the agent part of the implementation? Or did I misunderstand what you mean by 'abstraction'?

yiorgosynkl commented 3 years ago

Hello again! When running the CartPole tutorial in Webots, I came across this error message, when starting the simulation:
[image: screenshot of the error message]

It seems there is an error while importing the robot_supervisor module's classes. When running the import from python shell, a similar message occurs:

[image: screenshot of the error message in the Python shell]

I tried to fix the error without any success. Do you have suggestions on how to solve this bug? Since I want to finish my GSoC proposal, I'd appreciate an answer ASAP :)

veds12 commented 3 years ago

@yiorgosynkl you need to either set the Python path to the environment where you installed deepbots, or launch Webots from the command line while the environment where you installed deepbots is active.
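A quick way to check which case applies is to print the interpreter in use and whether deepbots is visible to it. The snippet below is a small diagnostic sketch that uses only the standard library; the helper name `module_available` is made up for this example. Running it both from a terminal and from inside a Webots controller shows whether the two use the same environment.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


# The interpreter path reveals which environment is actually running.
print("Interpreter:", sys.executable)
print("deepbots importable:", module_available("deepbots"))
```

If the interpreter path differs between the two runs, Webots is using a Python installation other than the one where deepbots was installed.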

eakirtas commented 3 years ago

@Piyush-555

Yes, this would be part of the agent. However, there are some "background" functionalities that deepbots should implement. Simulation-specific functionalities should be added to deepbots in order to provide an easy-to-use API to users. For example, after the end of every generation, the simulator should be reset to the initial state. I am not sure if mutate should be part of this API, but communication between supervisor and robots should be.

SidharajYadav commented 2 years ago

I want to contribute. Please guide me.