rmojgani / RLonKorali


Memory management #6

Open rmojgani opened 1 year ago

rmojgani commented 1 year ago

Memory management: @wadaniel I have realized that when I let the simulation run for a long time horizon, the RAM usage grows. This was not a problem for me before, but now that I am using more agents, it is causing issues for long-time simulations.

Below is a picture of what I mean; any suggestion on how to go about it is appreciated. This happens in both train and test modes.

[screenshot: RAM usage growing over the simulation horizon]

rmojgani commented 1 year ago

Unnecessary inner state was being held; it is resolved.

rmojgani commented 1 year ago

The memory consumption was mainly due to saving the forcing in "self.velist.append(self.veRL)":

https://github.com/rmojgani/RLonKorali/blob/30ee60570568c779237fc33d81b095666bb45d0e/experiments/flowControl_turb_code/_model/turb.py#L362

I removed that, which lowered the RAM usage, but it still increases (though at a lower rate). I am wondering whether it is the RL memory of past experiences?
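
For context, here is a minimal sketch of the kind of fix, with illustrative names (not the exact code in turb.py): keep only a bounded history of the forcing instead of appending every step.

```python
from collections import deque

# Illustrative sketch only -- names do not match turb.py exactly.
MAX_HISTORY = 10  # keep only the most recent forcing fields

class TurbModelSketch:
    def __init__(self):
        # previously something like: self.velist = []  (grows without bound)
        self.velist = deque(maxlen=MAX_HISTORY)

    def step(self, veRL):
        self.veRL = veRL
        # a bounded deque drops the oldest entry automatically,
        # so RAM stays flat no matter how long the horizon is
        self.velist.append(self.veRL)
```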

wadaniel commented 1 year ago

Yes, memory could grow until the replay memory is full. Can you see that?

wadaniel commented 1 year ago

Also, we store some historical values, like past rewards etc., but I think this is negligible.

rmojgani commented 1 year ago

> Yes, memory could grow until the replay memory is full. Can you see that?

Probably that's it. Can I limit the size of the replay memory? What happens now is that when the RAM is full, the run gets terminated. I have written a restart function to make it work for now, but it is not an ideal solution.

wadaniel commented 1 year ago

The replay memory size is defined e.g. here: https://github.com/rmojgani/RLonKorali/blob/main/experiments/flowControl_turb/run-vracer-turb.py#L69 and it will not exceed that. It is implemented as a ring buffer, i.e. if it is full and new experiences are collected, they overwrite the old ones, so no more memory will be used.
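
For illustration, here is a toy Python sketch of the ring-buffer behaviour described above (a sketch, not Korali's actual implementation): memory is allocated once for the maximum size, and new experiences wrap around and overwrite the oldest ones.

```python
import numpy as np

# Toy ring buffer illustrating the behaviour described above
# (a sketch, not Korali's actual implementation).
class ReplayRingBuffer:
    def __init__(self, max_size, entry_dim):
        # all memory is allocated once, up front
        self.buffer = np.zeros((max_size, entry_dim))
        self.max_size = max_size
        self.next_idx = 0  # where the next experience is written
        self.count = 0     # number of valid entries, capped at max_size

    def add(self, experience):
        self.buffer[self.next_idx] = experience
        self.next_idx = (self.next_idx + 1) % self.max_size  # wrap around
        self.count = min(self.count + 1, self.max_size)

# once count == max_size, add() only overwrites old entries;
# the buffer never grows beyond its initial allocation
```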

wadaniel commented 1 year ago

Can you check whether the RAM usage is still increasing after the experience memory is full?

[screenshot: console output showing 25000/1000000 experiences stored]

Here you see that 25000/1000000 experiences are stored (this is the console output of one of my current runs).

rmojgani commented 1 year ago

" 25000/1000000" what are the units of these? instance? KB/MB?

wadaniel commented 1 year ago

Ah, the unit is "experience" :D So basically, if your state is 4D and your action is 2D, and you use 2 agents, we will store (4+2+1)*2 floats per replay memory entry (the +1 comes from the reward).
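
As a rough back-of-envelope check (my own estimate, assuming 8-byte doubles and no per-entry bookkeeping), the full replay memory in that example stays around 0.1 GB:

```python
# Back-of-envelope estimate (assumption: 8-byte doubles, no extra bookkeeping)
state_dim, action_dim, n_agents = 4, 2, 2
floats_per_entry = (state_dim + action_dim + 1) * n_agents  # +1 is the reward
max_entries = 1_000_000
total_gb = floats_per_entry * 8 * max_entries / 1e9
print(f"~{total_gb:.2f} GB for a full replay memory")  # ~0.11 GB
```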

rmojgani commented 1 year ago

So basically I can try decreasing e["Solver"]["Experience Replay"]["Maximum Size"] from 100000 to 10000 and check whether the issue is resolved.
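
In the run script that change would look roughly like the sketch below (the surrounding Korali setup from run-vracer-turb.py is omitted; only the replay-size key is taken from the discussion above).

```python
import korali

# Sketch only: the VRACER / environment configuration is abbreviated.
k = korali.Engine()
e = korali.Experiment()

# ... environment, agent, and solver configuration ...

# cap the replay memory to limit RAM usage (was 100000)
e["Solver"]["Experience Replay"]["Maximum Size"] = 10000

k.run(e)
```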

rmojgani commented 1 year ago

This is one example of training output: [training output screenshot]

rmojgani commented 1 year ago

Does changing e["Solver"]["Experience Replay"]["Maximum Size"] affect learning in any way? Convergence of the reward, the converged solution ... Hmmm

wadaniel commented 1 year ago

Yes, it affects learning. Imagine you set the maximum size to 1000: you would store only the last episode (assuming it is of that length), and when you do the policy updates you would overfit to that episode. The algorithms are benchmarked and work well for sizes 100k - 500k.