Closed: nzmora closed this issue 5 years ago.
Lots of good stuff in here!
- Control over parallelization
Yes, please. I really want to be able to do `graph_manager.improve(num_workers=4)` or something similar. I can't work out how to do this at the moment. The advice seems to be to subclass `CoachLauncher`?
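To make the request concrete, here is a hedged sketch of the shape such an API might take. This is not Coach's implementation: `train_worker` is a placeholder, and real rollout workers would likely be separate processes rather than threads; only the fan-out pattern is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def train_worker(worker_id, steps):
    # Placeholder "training": each worker just reports how many steps it ran.
    return worker_id, steps

def improve(num_workers=4, steps_per_worker=10):
    # Fan the training loop out over num_workers workers and collect results.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(train_worker, i, steps_per_worker)
                   for i in range(num_workers)]
        return sorted(f.result() for f in futures)

print(improve(num_workers=2))  # [(0, 10), (1, 10)]
```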
- Simplification of the exported name-space.
Yes, nice to have. PEP8 says "packages should also have short, all-lowercase names, although the use of underscores is discouraged".
```python
from rl_coach.agents.ddpg_agent import DDPGAgentParameters
```

could be

```python
from rl_coach import agents
agents.ddpg.Parameters
```
A short-term fix could be to add an `api` module with a saner structure, which would also leave you free to change the underlying library layout later (e.g. `import rl_coach.agents.api as agents`).
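The re-export shim idea can be sketched generically. The snippet below builds a stand-in module with `types.ModuleType` so the pattern is self-contained; in reality the shim would be an `api.py` file inside `rl_coach` that re-exports the deep names (the class and alias names here are illustrative, not the real proposal).

```python
import types

def build_api_module():
    # Stand-in for a deeply nested class such as
    # rl_coach.agents.ddpg_agent.DDPGAgentParameters.
    class DDPGAgentParameters:
        pass

    # The shim: one shallow module that re-exports deep names under
    # short, stable aliases. Internal layout can now change freely.
    api = types.ModuleType("rl_coach_api")
    api.DDPGParameters = DDPGAgentParameters
    return api

api = build_api_module()
print(api.DDPGParameters.__name__)  # DDPGAgentParameters
```

Callers then depend only on the shallow path (`api.DDPGParameters`), so the import depth drops and the internals stay private.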
@shadiendrawis
How can I get access to the agent object from the `graph_manager`? I only seem to have access to the agent parameters. It would be good to have a way to access the agent object itself. This would enable many things after training, when the model is saved and re-loaded, e.g. saving the memory buffer when training ends, re-initializing/loading memory, etc.
PR #348 addresses the points raised in this issue.
@nitsanluke if you are using `BasicRLGraphManager` you can use the `get_agent` method to get access to the agent object. In the more general case, where there could be several hierarchical levels and several agents, you'll need to access a specific agent at a specific level using `graph_manager.level_managers[LM_index].agents['agent_name']`.
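The indexing path above can be illustrated with a minimal mock of the hierarchy (these are not the real Coach classes, just stand-ins showing the structure the comment describes: a graph manager holding level managers, each holding named agents):

```python
class Agent:
    def __init__(self, name):
        self.name = name

class LevelManager:
    def __init__(self, agents):
        self.agents = agents  # dict: agent name -> Agent

class GraphManager:
    def __init__(self, level_managers):
        self.level_managers = level_managers  # list, one per hierarchy level

gm = GraphManager([LevelManager({"agent_0": Agent("agent_0")})])

# The general access pattern: pick a level, then pick an agent by name.
agent = gm.level_managers[0].agents["agent_0"]
print(agent.name)  # agent_0
```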
Take a sip of your coffee and sit back: this is going to be long :-(
Background: Recently I've created a sample Distiller application which uses RL agents (DDPG; Clipped PPO) to automate DNN compression (WiP). In this use-case, a compression application creates a `DistillerWrapperEnvironment`, a Coach `graph_manager`, and a Coach `agent_params` from a preset file that is stored in the Distiller repo. The Automated Deep Compression (ADC) application populates the graph manager with the environment details, creates a graph, and executes `graph_manager.improve()`.

In this use-case, Coach is used as a library, serving an application. As far as I know, this is the first time Coach is used as a library and not as an application. The Coach-as-a-library use case works well, but Coach is lacking a couple of features that would make it a bit more user-friendly. Not all of these are equally important:
- `rl_coach.agents.ddpg_agent` can be exported directly from `rl_coach.agents` (reducing the import depth from 3 to 2). For example, today:

```python
from rl_coach.memories.memory import MemoryGranularity
from rl_coach.base_parameters import EmbedderScheme
from rl_coach.architectures.tensorflow_components.layers import Dense
```
```python
env_params = GymVectorEnvironment()
env_params.level = '../automated_deep_compression/ADC.py:DistillerWrapperEnvironment'
```