vivasvan1 opened 4 years ago
Can you check whether this happens only on my PC, or with your code as well?
Also, is there any way to avoid loading the full dataset into memory for training in MXNet? Something like the on-demand loading sketched below is what I have in mind.
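A minimal sketch of that idea, assuming a hypothetical directory of image files and using Gluon's `Dataset`/`DataLoader`; only the file names are kept in RAM and each image is decoded on access:

```python
# Hypothetical sketch: keep only file names in RAM and decode each
# image on access, instead of preloading the whole dataset.
import os
import mxnet as mx
from mxnet.gluon.data import Dataset, DataLoader

class LazyImageDataset(Dataset):
    def __init__(self, root):
        self.root = root
        self.names = sorted(os.listdir(root))  # names only, not pixels

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        # decoded on demand; freed once the batch goes out of scope
        return mx.image.imread(os.path.join(self.root, self.names[idx]))

loader = DataLoader(LazyImageDataset("data/"), batch_size=8,
                    shuffle=True, num_workers=4)
```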
Using pdb, I found that after every run of

`batch = batch_queue.get()`

an extra 0.10-0.15% of RAM is consumed and never seems to be released:
(Pdb) print("| ram=",psutil.virtual_memory().available * 100 / psutil.virtual_memory().total)
**ram= 28.66921519345203**
(Pdb) n
> /home/mask/maskflownet/MaskFlownet/main.py(572)<module>()
-> loading_time.update(default_timer() - t0)
(Pdb) print("| ram=",psutil.virtual_memory().available * 100 / psutil.virtual_memory().total)
**ram= 28.542640291935687**
I cannot figure out why this is happening, but I am sure of it. Can you help me fix this, please?
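For reference, here is roughly how I am measuring this outside of pdb. This is a sketch rather than exact code: `batch_queue` stands in for the queue that main.py builds, and `psutil` is the only extra dependency.

```python
# Minimal measurement loop; `batch_queue` stands in for the queue
# built in main.py, so this is a sketch rather than exact code.
import gc
import psutil

def ram_available_pct():
    vm = psutil.virtual_memory()
    return vm.available * 100 / vm.total

for step in range(1, 25):
    batch = batch_queue.get()
    del batch
    gc.collect()  # force a collection so cyclic garbage can't hide the trend
    print("steps=", step, "| ram=", ram_available_pct())
```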
Hi vivasvan1, thanks for pointing out this problem.
We import `Queue` from the Python `queue` package directly, without any modification. I searched on Google and found that other people have encountered the same problem, so this may not be a problem with our code but with the Python `queue` package.
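For what it's worth, the pattern in question is the standard producer/consumer built on `queue.Queue`; below is a minimal self-contained sketch, not the exact MaskFlownet code. With `maxsize` set, `put()` blocks once the queue is full, so at most a few batches are alive in the queue at any time; steady growth would then have to come from references that survive the `get()`.

```python
# Minimal bounded producer/consumer sketch (not the exact MaskFlownet code).
import queue
import threading

batch_queue = queue.Queue(maxsize=4)  # bounded: put() blocks when full

def load_batch(i):
    return [0.0] * 100000  # stand-in for a real batch

def producer(n):
    for i in range(n):
        batch_queue.put(load_batch(i))

threading.Thread(target=producer, args=(100,), daemon=True).start()

for _ in range(100):
    batch = batch_queue.get()
    # ... training step would go here ...
    del batch  # drop the last reference so the batch can be freed
```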
I have noticed that your training loop leaks small amounts of RAM: in the log below, available memory falls by roughly 0.1-0.2% per step. Any idea what may have caused this?
```
time taken= 9.865329265594482 | steps= 1 | cpu= 51.8 | ram= 34.50078675328186 | gpu= [3101] [5613]
time taken= 0.934636116027832 | steps= 2 | cpu= 27.0 | ram= 29.34866251942084 | gpu= [5613] [3045]
time taken= 0.8695635795593262 | steps= 3 | cpu= 29.4 | ram= 29.217970957706278 | gpu= [3045] [3021]
time taken= 0.8483304977416992 | steps= 4 | cpu= 29.8 | ram= 29.033316428574086 | gpu= [3021] [2997]
time taken= 0.8630681037902832 | steps= 5 | cpu= 30.2 | ram= 28.87988403913803 | gpu= [2997] [2997]
time taken= 0.8645083904266357 | steps= 6 | cpu= 29.4 | ram= 28.714746447210654 | gpu= [2997] [2997]
time taken= 0.864253044128418 | steps= 7 | cpu= 29.3 | ram= 28.573093657739385 | gpu= [2997] [2997]
time taken= 0.8693573474884033 | steps= 8 | cpu= 29.3 | ram= 28.389703885656044 | gpu= [2997] [2997]
time taken= 0.8704898357391357 | steps= 9 | cpu= 29.4 | ram= 28.298690976454438 | gpu= [2997] [2997]
time taken= 0.8670341968536377 | steps= 10 | cpu= 29.5 | ram= 28.13385097442091 | gpu= [2997] [2997]
time taken= 0.8750414848327637 | steps= 11 | cpu= 29.5 | ram= 27.959884882309396 | gpu= [2997] [2997]
time taken= 0.8624210357666016 | steps= 12 | cpu= 29.9 | ram= 27.784356443255188 | gpu= [2997] [2997]
time taken= 0.8561670780181885 | steps= 13 | cpu= 29.8 | ram= 27.644241201568796 | gpu= [2997] [2997]
time taken= 0.8609695434570312 | steps= 14 | cpu= 29.7 | ram= 27.51883186047002 | gpu= [2997] [2997]
time taken= 0.8462607860565186 | steps= 15 | cpu= 29.7 | ram= 27.36641623650461 | gpu= [2997] [2997]
time taken= 0.8624782562255859 | steps= 16 | cpu= 29.2 | ram= 27.23760941078441 | gpu= [2997] [2997]
time taken= 0.8649694919586182 | steps= 17 | cpu= 29.4 | ram= 27.113514425050127 | gpu= [2997] [2997]
time taken= 0.8661544322967529 | steps= 18 | cpu= 29.3 | ram= 27.004993310427178 | gpu= [2997] [2997]
time taken= 0.8687705993652344 | steps= 19 | cpu= 29.8 | ram= 26.82090916192486 | gpu= [2997] [2997]
time taken= 0.8823645114898682 | steps= 20 | cpu= 29.6 | ram= 26.688630454109777 | gpu= [2997] [2997]
time taken= 0.8795809745788574 | steps= 21 | cpu= 29.4 | ram= 26.517987449146226 | gpu= [2997] [2997]
time taken= 0.8857841491699219 | steps= 22 | cpu= 29.1 | ram= 26.40289455770082 | gpu= [2997] [2997]
time taken= 0.8605339527130127 | steps= 23 | cpu= 29.5 | ram= 26.274509317663572 | gpu= [2997] [2997]
time taken= 0.8524265289306641 | steps= 24 | cpu= 29.8 | ram= 26.16445065525575 | gpu= [2997]
```