the-anylogic-company / AnyLogic-Pypeline

A custom AnyLogic library for running Python inside an AnyLogic model (Java)
https://github.com/the-anylogic-company/AnyLogic-Pypeline/wiki
MIT License

Reinforcement learning #14

Closed doncha88 closed 3 years ago

doncha88 commented 3 years ago

Can you please tell me if there will be a version for training with reinforcement learning in Python? And if so, when will it be released?

t-wolfeadam commented 3 years ago

Hi, there is something similar to what you are referring to that is in the prototyping stage at the moment. There's no dedicated release date yet, though possibly towards the end of Q2 or Q3 of this year. I'll update the readme for this project when a more concrete timeline is in place.

For now, you can always use Bonsai or Pathmind.

t-wolfeadam commented 3 years ago

(To further clarify, this will be its own separate project, as RL training is not possible with Pypeline)

doncha88 commented 3 years ago

Thanks for the information. Yes, I looked at those libraries; DL4J is also common now for reinforcement learning, but they are not in Python, which is the problem. Are there any methods that can make Pypeline bypass these restrictions and do reinforcement learning? Maybe there's some kind of life hack?

t-wolfeadam commented 3 years ago

There's nothing [truly] inherent that restricts you from trying to do RL training using Pypeline. I've summarized why this is infeasible below, but for a verbal explanation, see here, and for a more detailed explanation with visuals, see the last two questions of this Q&A.

In short, the problem is that with Pypeline, AnyLogic is the parent process of Python. If you close the simulation model, the Python connection dies with it - just like if you're running Python in a command prompt and then close the window. With RL, AnyLogic is also the environment for the learning agent. You could start an RL training, but it would only last one episode: once the AI agent wants to reset the environment, it will also close the Python connection (because it's ending the simulation model). One way around this would be to save the state of the AI at the end of each sim run and resume it at the start of the next (but that would be horribly inefficient). The only other way is to host your AI agent as a web app of sorts and communicate with it via HTTP requests (or similar).
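For the web-app route, here's a minimal sketch (not anything built into Pypeline): the RL agent lives in its own long-lived Python process behind a small HTTP server, so its state survives every simulation restart, and each AnyLogic episode posts observations and rewards to it. The `/act` and `/learn` endpoints, the `DummyAgent` class, and the payload fields are purely illustrative assumptions, not an existing API.

```python
# Illustrative sketch: host an RL agent as a separate web service so it
# outlives individual simulation runs. Requires Flask (pip install flask).
from flask import Flask, request, jsonify

app = Flask(__name__)

# Placeholder for a real RL agent built with the library of your choice;
# it lives as long as this server process, independent of AnyLogic.
class DummyAgent:
    def act(self, observation):
        return 0  # always pick action 0; replace with a real policy

    def learn(self, observation, action, reward, done):
        pass  # replace with a real training/update step

agent = DummyAgent()

@app.route("/act", methods=["POST"])
def act():
    # The model sends its current observation and gets an action back.
    obs = request.get_json()["observation"]
    return jsonify({"action": agent.act(obs)})

@app.route("/learn", methods=["POST"])
def learn():
    # The model reports the outcome of the last step so the agent can learn.
    data = request.get_json()
    agent.learn(data["observation"], data["action"], data["reward"], data["done"])
    return jsonify({"ok": True})

if __name__ == "__main__":
    # The AnyLogic model would POST to these endpoints each step (e.g. via
    # Java's HttpURLConnection, or via Pypeline calling Python's 'requests'),
    # and the agent persists even when the simulation closes and restarts.
    app.run(port=5000)
```

The key design point is simply that the agent's process is no longer a child of the simulation, so resetting the environment (closing and reopening the model) no longer destroys the agent's state.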

doncha88 commented 3 years ago

thanks for the detailed answer