I've noticed that memory use grows over time in a Coach training process. This matters because RL training runs can be long, and lower memory usage reduces training costs.
I instrumented the agent train loop with tracemalloc and looked at the top memory allocators:
Ignoring tracemalloc's own allocations, the top allocator is Transition.add_info. Traceback:
File ".../coach/rl_coach/core_types.py", line 250
self.info.update(new_info)
File ".../coach/rl_coach/agents/agent.py", line 870
transition.add_info(self.last_action_info.__dict__)
File ".../coach/rl_coach/level_manager.py", line 219
done = acting_agent.observe(env_response)
File ".../coach/rl_coach/graph_managers/graph_manager.py", line 443
result = self.top_level_manager.step(None)
File ".../coach/rl_coach/graph_managers/graph_manager.py", line 476
self.act(EnvironmentSteps(1))
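For reference, the instrumentation was along these lines (a minimal sketch; the helper name and the exact hook point in the train loop are mine, not Coach's):

```python
import tracemalloc

def top_allocators(limit=5):
    """Snapshot current allocations and return the top sites by total size."""
    snapshot = tracemalloc.take_snapshot()
    return snapshot.statistics("traceback")[:limit]

tracemalloc.start(25)  # keep up to 25 frames per allocation traceback

# ... then, inside the training loop every N steps:
for stat in top_allocators():
    print(stat)  # total size / block count for this allocation site
    for line in stat.traceback.format():
        print(line)
```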
This looks like a leak, since transition objects should be eligible for garbage collection once they've been consumed.
Any advice here? I can't work out what might be keeping a reference to the transitions.
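One way to check a suspect object directly is to ask the garbage collector what still references it. A sketch (stack frames are filtered out, since the caller's own local variable always shows up as a frame referrer):

```python
import gc
import types

def find_referrers(obj):
    """Return the containers still holding a reference to obj, ignoring stack frames."""
    return [r for r in gc.get_referrers(obj)
            if not isinstance(r, types.FrameType)]

# usage: pass in a transition you expected to be freed and inspect what comes back
# for r in find_referrers(suspect_transition):
#     print(type(r), repr(r)[:120])
```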
I think what was happening is that some code I added to the fetch list was allocating variables, and that was causing the memory growth. I haven't worked out why the tracemalloc output was misleading yet :/
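The general failure mode here, independent of TensorFlow specifics: constructing something inside the step loop that registers itself in a long-lived structure (the graph), so memory grows even after your own references are dropped. A framework-free toy sketch of the pattern (the class and names are illustrative, not Coach or TF code):

```python
class Graph:
    """Toy stand-in for a TF1-style graph that owns every op ever created."""
    def __init__(self):
        self.ops = []

    def create_op(self, name):
        op = {"name": name}
        self.ops.append(op)  # the graph keeps the op alive forever
        return op

# BAD: a new op per step, so graph.ops grows without bound
graph = Graph()
for step in range(3):
    fetch = graph.create_op("reduce_mean")
assert len(graph.ops) == 3  # one leaked op per iteration

# FIX: create the op once, outside the loop, and reuse the handle
graph2 = Graph()
fetch = graph2.create_op("reduce_mean")
for step in range(3):
    pass  # each step would run the same pre-built fetch
assert len(graph2.ops) == 1
```

This would also explain why the transitions looked like the culprit without being one: the growth happens in the graph's internal structures, not in anything the agent code holds directly.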