eleurent / rl-agents

Implementations of Reinforcement Learning and Planning algorithms
MIT License
582 stars 152 forks

How to contribute? #41

Open mhtb32 opened 4 years ago

mhtb32 commented 4 years ago

I wanted to know whether contributions are welcome here and, if so, how to contribute. Is there any guideline for how agents should be implemented?

In fact, I wanted to implement agents like DDPG, SAC, and TD3. Are these in the scope of this project?

eleurent commented 4 years ago

Contributions are absolutely welcome, though guidelines are unfortunately lacking. I'll try to address this. The main steps would be:

You can have a look at the DQN implementation for guidance (it is split between the abstract.py and pytorch.py files for historical, now deprecated, reasons).

Policy gradient algorithms are definitely in the scope of this project. I wanted to implement a few myself (see #4) but never found the time. You're welcome to try, and I'll provide any support you need.

You should also know that there are two ways to train agents, both defined in the Evaluation class. The default one (run_episodes) works as follows:

for episode in range(num_episodes):
    state, done = env.reset(), False
    while not done:
        action = agent.act(state)
        next_state, reward, done, info = env.step(action)
        agent.record(state, action, next_state, reward, done, info)
        state = next_state
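
To make the loop above concrete, here is a minimal runnable sketch. CoinFlipEnv, RandomAgent and run_episodes_sketch are hypothetical stand-ins written for this illustration; they are not classes or functions from rl-agents, and the real Evaluation.run_episodes may differ in its details. Only the act/record calls mirror the snippet above.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: guess a coin flip for `horizon` steps."""

    def __init__(self, horizon=5, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the state is simply the step counter

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == self.rng.randint(0, 1) else 0.0
        done = self.t >= self.horizon
        return self.t, reward, done, {}


class RandomAgent:
    """Hypothetical agent exposing only the act/record methods used above."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.transitions = []

    def act(self, state):
        return self.rng.randint(0, 1)

    def record(self, state, action, next_state, reward, done, info):
        # A learning agent would store the transition and update its model here.
        self.transitions.append((state, action, next_state, reward, done))


def run_episodes_sketch(env, agent, num_episodes):
    """Stand-in for the online loop: record after every environment step."""
    for _ in range(num_episodes):
        state, done = env.reset(), False
        while not done:
            action = agent.act(state)
            next_state, reward, done, info = env.step(action)
            agent.record(state, action, next_state, reward, done, info)
            state = next_state


agent = RandomAgent()
run_episodes_sketch(CoinFlipEnv(horizon=5), agent, num_episodes=3)
print(len(agent.transitions))  # 15 transitions: 3 episodes of 5 steps
```

In this style, an agent such as DQN would do its learning inside record, one transition at a time.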

Alternatively, the run_batched_episodes method runs a batch of sample-collection jobs in parallel before updating the model. Something like:

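for batch in range(num_batches):
    for episode in range(episodes_per_batch):  # in parallel
        state, done = env.reset(), False
        while not done:
            action = agent.act(state)
            next_state, reward, done, info = env.step(action)
            agent.record(state, action, next_state, reward, done, info)
            state = next_state
    agent.update()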

This is probably relevant for policy gradient algorithms.
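
As a rough illustration of the batched pattern, the sketch below collects several episodes concurrently and calls update() once per batch. Everything here is an assumption made for the example: CounterEnv, BatchAgent, collect_episode and run_batched_episodes_sketch are hypothetical names, and a thread pool is used only as a simple stand-in, since the thread does not say how rl-agents parallelizes the collection jobs.

```python
from concurrent.futures import ThreadPoolExecutor


class CounterEnv:
    """Hypothetical toy environment that terminates after `horizon` steps."""

    def __init__(self, horizon=4):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        reward = 1.0 if action == self.t % 2 else 0.0
        self.t += 1
        return self.t, reward, self.t >= self.horizon, {}


class BatchAgent:
    """Hypothetical agent that only learns when update() is called."""

    def __init__(self):
        self.batch = []
        self.num_updates = 0
        self.last_batch_size = 0

    def act(self, state):
        return state % 2  # fixed policy; reads no mutable state

    def record(self, state, action, next_state, reward, done, info):
        self.batch.append((state, action, next_state, reward, done))

    def update(self):
        # A policy gradient agent would compute its loss over the batch here.
        self.num_updates += 1
        self.last_batch_size = len(self.batch)
        self.batch.clear()


def collect_episode(agent, horizon):
    """Run one episode in a private environment and return its transitions."""
    env = CounterEnv(horizon)
    state, done = env.reset(), False
    transitions = []
    while not done:
        action = agent.act(state)
        next_state, reward, done, info = env.step(action)
        transitions.append((state, action, next_state, reward, done, info))
        state = next_state
    return transitions


def run_batched_episodes_sketch(agent, num_batches, episodes_per_batch, horizon=4):
    with ThreadPoolExecutor(max_workers=episodes_per_batch) as pool:
        for _ in range(num_batches):
            jobs = [pool.submit(collect_episode, agent, horizon)
                    for _ in range(episodes_per_batch)]
            for job in jobs:  # gather results in the main thread
                for transition in job.result():
                    agent.record(*transition)
            agent.update()  # one model update per collected batch


agent = BatchAgent()
run_batched_episodes_sketch(agent, num_batches=2, episodes_per_batch=3)
print(agent.num_updates, agent.last_batch_size)  # 2 updates, 12 transitions each
```

Recording in the main thread keeps the agent's buffer free of concurrent writes; the workers only call act, which is read-only in this sketch.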

mhtb32 commented 4 years ago

Thanks for your explanation. I'll keep this issue open so it can act as a temporary contribution guide for others.