Closed csingh27 closed 3 years ago
Set memory limit to a specific process ram - How to set a memory limit for a specific process? - Ask Ubuntu
https://askubuntu.com/questions/510913/how-to-set-a-memory-limit-for-a-specific-process
https://dev.to/rrampage/surviving-the-linux-oom-killer-2ki9
https://www.tecmint.com/find-process-name-pid-number-linux/#:~:text=To%20find%20out%20the%20PID%20of%20a%20process%2C,print%20its%20name%20using%20the%20command%20form%20below%3A
The process gets killed by the OOM killer when its oom_score is high. Keep checking the score in /proc/PID/oom_score. Also set /proc/PID/oom_score_adj to -1000 so that the OOM killer does not pick this process.
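A minimal sketch for checking the two values from Python instead of using cat by hand. It assumes a Linux /proc filesystem; the helper name read_oom_info is made up for this note.

```python
import os
from pathlib import Path


def read_oom_info(pid: int) -> dict:
    """Return the OOM score and adjustment the kernel reports for a PID."""
    proc = Path("/proc") / str(pid)
    return {
        "oom_score": int((proc / "oom_score").read_text()),
        "oom_score_adj": int((proc / "oom_score_adj").read_text()),
    }


if __name__ == "__main__":
    # Inspect the current interpreter; pass another PID to watch a different process.
    print(read_oom_info(os.getpid()))
```

Calling this in a loop while training runs should show the score climbing as memory fills up.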
https://www.percona.com/blog/2019/08/02/out-of-memory-killer-or-savior/ Reasons could be out of space OR out of memory
The python3 process that I am running has a high oom_score and hence gets killed. Changing oom_score_adj to -1000 brings the oom_score down to 0, which stops the OOM killer from killing the process. But then the system crashes, because the process still uses up all the memory.
So I need a way to actually restrict the memory usage of this process to a limit so that it does not eat up all the resources.
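One possible way to do this from Python is the standard resource module. The sketch below starts a command in a child process with RLIMIT_AS set, so allocations beyond the cap fail inside that process (in Python, as a MemoryError) instead of exhausting the machine. Assumptions: Linux, and a helper name (run_with_memory_limit) invented here. RLIMIT_AS caps virtual address space rather than resident memory, so libraries that reserve large address ranges may need a higher cap than expected.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(cmd, max_bytes, **kwargs):
    """Run cmd in a child process whose address space is capped at max_bytes."""

    def apply_limit():
        # Runs in the child just before exec; the existing hard limit is kept.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))

    return subprocess.run(cmd, preexec_fn=apply_limit, **kwargs)
```

For example, run_with_memory_limit([sys.executable, "train.py"], 4 * 1024**3) would cap a (hypothetical) train.py at 4 GiB of address space.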
Watching the System Monitor, it is evident that the memory fills up. I need to find a way to keep it from filling.
Tried: reducing batch size to 2. Did not work. It makes sense that the data accumulates and does not
But no, it was running properly for the same number of steps on the office laptop. So it does not have to do with PyTorch.
It is some memory problem actually.
FOUND THE PROBLEM: the matplotlib plots are just getting kept in memory!
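A sketch of the usual fix, assuming the plots are created through pyplot: pyplot keeps every figure alive until it is closed, so closing each figure right after saving lets its memory be released. The function names and the Agg backend choice are assumptions for illustration, not the code from this project.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, plots only go to disk
import matplotlib.pyplot as plt


def save_plot(values, path):
    """Save one line plot to path and release the figure afterwards."""
    fig, ax = plt.subplots()
    ax.plot(values)
    fig.savefig(path)
    plt.close(fig)  # without this, pyplot keeps the figure in memory


def demo(n_steps=5):
    """Save n_steps plots and return how many figures are still open."""
    with tempfile.TemporaryDirectory() as out_dir:
        for step in range(n_steps):
            save_plot([step, step + 1, step + 2], os.path.join(out_dir, f"step_{step}.png"))
    return len(plt.get_fignums())
```

If demo() returns 0, no figures are left open after the loop.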
Yes, that solves a major problem. However, apart from the plots, some other values are also accumulating in memory. Need to find out which.
It could be because of action.cpu() in the choose_actions function, as it moves the action to the CPU.
Added gc.collect() hopefully this solves it
No, it doesn't. Need to find another way to deallocate that memory.
To do:
Identify what causes a major increase in RAM usage
Change batch size to see if it makes a difference
Change image size
Don't save any data
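For the first to-do item, one option is the standard tracemalloc module, which can compare two snapshots and rank source lines by how much their allocations grew. This is only a sketch: top_growth and leaky_step are invented names, and tracemalloc only sees memory allocated through Python's allocator, so native buffers held by other libraries may not show up.

```python
import tracemalloc


def top_growth(work, limit=5):
    """Run work() and return the source lines whose allocations grew the most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    work()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to sorts by the absolute size difference, largest first.
    return after.compare_to(before, "lineno")[:limit]


if __name__ == "__main__":
    history = []

    def leaky_step():
        # Stand-in for one training iteration that keeps appending results.
        for _ in range(10_000):
            history.append([0.0] * 100)

    for stat in top_growth(leaky_step):
        print(stat)
```

Wrapping a single training iteration in work() should point at the line that keeps growing.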
Memory also accumulates on office laptop
The problem is appending all the network outputs to a dictionary. This accumulates!
Saving separate csv files after each iteration and emptying the dictionary containing the relevant variables after that. Also, not saving the log_sac_inputs because of its huge size.
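Roughly, the per-iteration flush can look like the sketch below. The dictionary layout (one list per variable, all the same length), the file naming, and the function name flush_logs are assumptions made for this example.

```python
import csv
import os


def flush_logs(logs, out_dir, iteration):
    """Write the buffered values to one CSV file, then empty the buffers."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"log_{iteration:05d}.csv")
    keys = list(logs)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(keys)
        # zip(*columns) turns the per-variable lists into per-step rows;
        # it assumes all lists have the same length.
        writer.writerows(zip(*(logs[k] for k in keys)))
    for k in keys:
        logs[k].clear()  # drop the references so memory does not accumulate
    return path
```

Calling flush_logs(logs, "logs", i) at the end of iteration i keeps the dictionary at the size of a single iteration.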
Problem is due to appends. Check issue #61
How to Find a Process Name Using PID Number in Linux (tecmint.com)
Type "top" in terminal
Surviving the Linux OOM Killer - DEV Community
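echo -1000 | sudo tee /proc/5302/oom_score_adj
where 5302 is the PID for the python3 process (with "sudo echo ... > file" the redirect is done by the non-root shell, so it fails with permission denied; tee does the write as root)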