@durack1 @lee1043 please post your code here. When I ran @lee1043's version before, it leaked nothing.
@doutriaux1, it's probably better to move this issue (posted code, discussion) back to the vcs repo, no? If so, which issue #?
Let's start here because I suspect it's user error at this point.
@doutriaux1 the two variants currently running (as it takes soooooooo long to plot these things, they're still running 12hrs later..) are: PCMDI/amipbcs/make_newVsOldDiffs.py and PCMDI/amipbcs/make_newVsOldDiffs-delFudge.py
Both reference data on ocean
@durack1 you had a template object created that you were not clearing; once this is fixed it seems stable. Will test again shortly when I get onsite because remotely it is painful. Might try with a mesa version as well just to see.
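For anyone wanting to confirm this kind of problem, a quick check is to count the registered templates before and after a few plot iterations; a small sketch, assuming vcs.listelements("template") lists the currently registered templates as in recent vcs releases:

import vcs

# Sketch: confirm whether template objects are piling up across plots.
before = len(vcs.listelements("template"))
# ... run a handful of plot iterations here ...
after = len(vcs.listelements("template"))
print("templates: %d -> %d" % (before, after))  # a growing count means they are leaking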
(nightly) doutriaux1@oceanonly:[Paul]:[1830]> python test_leak.py
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
UV-CDAT version: 2.10-19-gba336f2
UV-CDAT prefix: /usr/local/anaconda2/envs/nightly
delFudge: False
Background graphics: True
donotstoredisplay: True
00001 processing: 1870-01 sic Time: 81.494 secs; Max mem: 1.497 GB PyObj#: 0113248;
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
00002 processing: 1870-02 sic Time: 93.325 secs; Max mem: 1.911 GB PyObj#: 0113249;
00003 processing: 1870-03 sic Time: 78.404 secs; Max mem: 1.911 GB PyObj#: 0113005;
00004 processing: 1870-04 sic Time: 78.045 secs; Max mem: 1.911 GB PyObj#: 0113006;
00005 processing: 1870-05 sic Time: 76.214 secs; Max mem: 1.911 GB PyObj#: 0113007;
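(For reference, the Time / Max mem / PyObj# columns in these logs can be produced per iteration with something like the sketch below; the helper name and exact formatting are hypothetical, and note that ru_maxrss is reported in kilobytes on Linux.)

import gc
import resource
import time

def log_iteration(count, date_str, var_name, t_start):
    # Wall-clock time for this iteration
    elapsed = time.time() - t_start
    # Peak resident set size so far, converted from KB (Linux) to GB
    max_mem_gb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0 ** 2
    # Number of objects currently tracked by the Python interpreter
    n_objects = len(gc.get_objects())
    print("%05d processing: %s %s Time: %06.3f secs; Max mem: %5.3f GB PyObj#: %07d;"
          % (count, date_str, var_name, elapsed, max_mem_gb, n_objects))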
@durack1 ok, switching to mesalib I get a ridiculous speed gain... I do see the memory increasing, I will investigate further.
(nightly) doutriaux1@oceanonly:[Paul]:[1831]> python test_leak.py
UV-CDAT version: 2.10-19-gba336f2
UV-CDAT prefix: /usr/local/anaconda2/envs/nightly
delFudge: False
Background graphics: True
donotstoredisplay: True
00001 processing: 1870-01 sic Time: 05.058 secs; Max mem: 1.535 GB PyObj#: 0113186;
00002 processing: 1870-02 sic Time: 05.195 secs; Max mem: 2.023 GB PyObj#: 0113202;
00003 processing: 1870-03 sic Time: 04.915 secs; Max mem: 2.026 GB PyObj#: 0113112;
00004 processing: 1870-04 sic Time: 04.977 secs; Max mem: 2.073 GB PyObj#: 0113133;
00005 processing: 1870-05 sic Time: 05.094 secs; Max mem: 2.122 GB PyObj#: 0113114;
00006 processing: 1870-06 sic Time: 05.198 secs; Max mem: 2.174 GB PyObj#: 0113255;
00007 processing: 1870-07 sic Time: 04.765 secs; Max mem: 2.222 GB PyObj#: 0113196;
00008 processing: 1870-08 sic Time: 04.840 secs; Max mem: 2.273 GB PyObj#: 0113117;
00009 processing: 1870-09 sic Time: 04.438 secs; Max mem: 2.318 GB PyObj#: 0113198;
@doutriaux1 what's the object name, and the command to delete it? I'll add it in and rerun my script/log..
@doutriaux1 I think that a trimmed-down version of this script, along with a single timeslice of input data, should be added to the test suite.. It seems to be a very sensitive example of the timings and memory usage when turning mesalib on and off, and of any further tweaks to EzTemplate.
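A rough sketch of what such a trimmed-down test could look like (the file name and drift threshold are made up, and psutil is assumed only as a convenient RSS probe):

import gc
import os

import cdms2
import psutil  # assumed available; any resident-memory probe would do
import vcs


def test_repeated_plot_does_not_leak(tmp_path):
    proc = psutil.Process(os.getpid())
    f = cdms2.open("sic_single_timeslice.nc")  # hypothetical single-timeslice file
    s = f("sic")
    x = vcs.init()
    iso = x.createisofill()

    rss = []
    for i in range(10):
        x.plot(s, iso, bg=True)
        x.png(str(tmp_path / ("plot_%02d.png" % i)))
        x.clear()
        gc.collect()
        rss.append(proc.memory_info().rss)

    # allow some warm-up growth, but the second half should be roughly flat
    assert rss[-1] - rss[4] < 50 * 1024 ** 2  # < ~50 MB of drift
    x.close()
    f.close()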
@durack1 yes, once we've figured out where the 50 MB per plot comes from 😉 @danlipsa I suspect something in isofills?
@doutriaux1 would you like to revisit the "user error" https://github.com/PCMDI/amipbcs/issues/10#issuecomment-324334436 above?
@durack1 @danlipsa, yes, it's something in the plotting of isofill/boxfill; I will track it down today or tomorrow.
@doutriaux1 time to move discussions back into the vcs repo? I think it'll be cleaner to generate a dedicated issue and move comments there
Not fully revisiting the user error: there were some leftover templates which were creating the additional objects you complained about, so that part was definitely a user error. There is only one textorientation created per plot now, so that user-error part is now solved. Need to figure out why contours create such a mem leak now though.
What was the object that required deletion? You didn't answer my https://github.com/PCMDI/amipbcs/issues/10#issuecomment-324389283 above
@doutriaux1 see https://github.com/UV-CDAT/uvcdat/issues/1424#issuecomment-221677132 it may in fact have already been identified by @chaosphere2112
The tool described in https://github.com/UV-CDAT/uvcdat/issues/1424#issuecomment-224293967 may also be really useful in determining the cause and finally sorting out this issue once and for all
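Without re-reading that comment, one common approach for the Python side is objgraph (just a guess on my part, not necessarily the tool referenced there); note it only sees Python objects, so it won't catch growth in the C++/VTK layer, which is where the mesalib leak most likely lives:

import objgraph  # hypothetical choice, not necessarily the tool referenced above

objgraph.show_growth(limit=10)  # establish a baseline snapshot
# ... one plot iteration here (canvas.plot / canvas.png / canvas.clear) ...
objgraph.show_growth(limit=10)  # prints only the types whose instance counts grew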
@durack1 rightfully pointed out that using X we have no more leak!
(nightlyX) doutriaux1@oceanonly:[Paul]:[1896]> python test_leak.py
UV-CDAT version: 2.10-19-gba336f2
UV-CDAT prefix: /usr/local/anaconda2/envs/nightlyX
delFudge: False
Background graphics: True
donotstoredisplay: True
00001 processing: 1870-01 sic Time: 02.280 secs; Max mem: 1.543 GB PyObj#: 0113551;
00002 processing: 1870-02 sic Time: 02.083 secs; Max mem: 2.008 GB PyObj#: 0113254;
00003 processing: 1870-03 sic Time: 01.329 secs; Max mem: 2.038 GB PyObj#: 0113366;
00004 processing: 1870-04 sic Time: 01.057 secs; Max mem: 2.038 GB PyObj#: 0113437;
00005 processing: 1870-05 sic Time: 01.046 secs; Max mem: 2.038 GB PyObj#: 0113327;
00006 processing: 1870-06 sic Time: 01.013 secs; Max mem: 2.038 GB PyObj#: 0113328;
00007 processing: 1870-07 sic Time: 00.985 secs; Max mem: 2.038 GB PyObj#: 0113329;
00008 processing: 1870-08 sic Time: 01.034 secs; Max mem: 2.038 GB PyObj#: 0113330;
00009 processing: 1870-09 sic Time: 00.990 secs; Max mem: 2.038 GB PyObj#: 0113269;
00010 processing: 1870-10 sic Time: 00.975 secs; Max mem: 2.038 GB PyObj#: 0113270;
00011 processing: 1870-11 sic Time: 01.035 secs; Max mem: 2.038 GB PyObj#: 0113271;
00012 processing: 1870-12 sic Time: 01.260 secs; Max mem: 2.038 GB PyObj#: 0113272;
00013 processing: 1871-01 sic Time: 01.047 secs; Max mem: 2.038 GB PyObj#: 0113221;
00014 processing: 1871-02 sic Time: 01.051 secs; Max mem: 2.038 GB PyObj#: 0113111;
00015 processing: 1871-03 sic Time: 01.010 secs; Max mem: 2.038 GB PyObj#: 0113112;
00016 processing: 1871-04 sic Time: 00.969 secs; Max mem: 2.038 GB PyObj#: 0113113;
00017 processing: 1871-05 sic Time: 00.976 secs; Max mem: 2.038 GB PyObj#: 0113225;
00018 processing: 1871-06 sic Time: 01.044 secs; Max mem: 2.038 GB PyObj#: 0113115;
00019 processing: 1871-07 sic Time: 01.200 secs; Max mem: 2.038 GB PyObj#: 0113116;
00020 processing: 1871-08 sic Time: 01.053 secs; Max mem: 2.038 GB PyObj#: 0113117;
00021 processing: 1871-09 sic Time: 01.556 secs; Max mem: 2.038 GB PyObj#: 0113118;
00022 processing: 1871-10 sic Time: 00.971 secs; Max mem: 2.038 GB PyObj#: 0113119;
But if I switch to the mesa-based one I get:
(nightly) doutriaux1@oceanonly:[Paul]:[1897]> python test_leak.py
UV-CDAT version: 2.10-19-gba336f2
UV-CDAT prefix: /usr/local/anaconda2/envs/nightly
delFudge: False
Background graphics: True
donotstoredisplay: True
00001 processing: 1870-01 sic Time: 01.813 secs; Max mem: 1.494 GB PyObj#: 0113365;
00002 processing: 1870-02 sic Time: 01.426 secs; Max mem: 1.961 GB PyObj#: 0113336;
00003 processing: 1870-03 sic Time: 01.177 secs; Max mem: 1.961 GB PyObj#: 0113407;
00004 processing: 1870-04 sic Time: 01.169 secs; Max mem: 1.974 GB PyObj#: 0113297;
00005 processing: 1870-05 sic Time: 01.180 secs; Max mem: 1.987 GB PyObj#: 0113298;
00006 processing: 1870-06 sic Time: 01.175 secs; Max mem: 2.001 GB PyObj#: 0113299;
00007 processing: 1870-07 sic Time: 01.218 secs; Max mem: 2.014 GB PyObj#: 0113300;
00008 processing: 1870-08 sic Time: 01.123 secs; Max mem: 2.027 GB PyObj#: 0113301;
00009 processing: 1870-09 sic Time: 01.394 secs; Max mem: 2.040 GB PyObj#: 0113241;
00010 processing: 1870-10 sic Time: 01.109 secs; Max mem: 2.053 GB PyObj#: 0113242;
00011 processing: 1870-11 sic Time: 01.127 secs; Max mem: 2.066 GB PyObj#: 0113243;
00012 processing: 1870-12 sic Time: 01.093 secs; Max mem: 2.079 GB PyObj#: 0113244;
It is EXACTLY the same python underneath. @danlipsa I think it's either a mesalib or a VTK-built-against-mesalib issue. Your thoughts?
@doutriaux1 @danlipsa and to fill in the holes, the reason I can't use X11 is that with the VNC session (and bg=1) I get all black/blank pngs..
@doutriaux1 it's also interesting to note that in your example above it's not just a single object added each time; see the transitions:
00001 processing: 1870-01 sic Time: 02.280 secs; Max mem: 1.543 GB PyObj#: 0113551;
00002 processing: 1870-02 sic Time: 02.083 secs; Max mem: 2.008 GB PyObj#: 0113254;
00003 processing: 1870-03 sic Time: 01.329 secs; Max mem: 2.038 GB PyObj#: 0113366;
00004 processing: 1870-04 sic Time: 01.057 secs; Max mem: 2.038 GB PyObj#: 0113437;
00005 processing: 1870-05 sic Time: 01.046 secs; Max mem: 2.038 GB PyObj#: 0113327;
00006 processing: 1870-06 sic Time: 01.013 secs; Max mem: 2.038 GB PyObj#: 0113328;
00007 processing: 1870-07 sic Time: 00.985 secs; Max mem: 2.038 GB PyObj#: 0113329;
00008 processing: 1870-08 sic Time: 01.034 secs; Max mem: 2.038 GB PyObj#: 0113330;
00009 processing: 1870-09 sic Time: 00.990 secs; Max mem: 2.038 GB PyObj#: 0113269;
00010 processing: 1870-10 sic Time: 00.975 secs; Max mem: 2.038 GB PyObj#: 0113270;
00011 processing: 1870-11 sic Time: 01.035 secs; Max mem: 2.038 GB PyObj#: 0113271;
00012 processing: 1870-12 sic Time: 01.260 secs; Max mem: 2.038 GB PyObj#: 0113272;
00013 processing: 1871-01 sic Time: 01.047 secs; Max mem: 2.038 GB PyObj#: 0113221;
00014 processing: 1871-02 sic Time: 01.051 secs; Max mem: 2.038 GB PyObj#: 0113111;
00015 processing: 1871-03 sic Time: 01.010 secs; Max mem: 2.038 GB PyObj#: 0113112;
00016 processing: 1871-04 sic Time: 00.969 secs; Max mem: 2.038 GB PyObj#: 0113113;
00017 processing: 1871-05 sic Time: 00.976 secs; Max mem: 2.038 GB PyObj#: 0113225;
00018 processing: 1871-06 sic Time: 01.044 secs; Max mem: 2.038 GB PyObj#: 0113115;
00019 processing: 1871-07 sic Time: 01.200 secs; Max mem: 2.038 GB PyObj#: 0113116;
00020 processing: 1871-08 sic Time: 01.053 secs; Max mem: 2.038 GB PyObj#: 0113117;
00021 processing: 1871-09 sic Time: 01.556 secs; Max mem: 2.038 GB PyObj#: 0113118;
00022 processing: 1871-10 sic Time: 00.971 secs; Max mem: 2.038 GB PyObj#: 0113119;
Some of these have a single additional python object created (the number behind the PyObj#: entry), whereas other times the numbers change by much more than 1.
@durack1 some also show the py obj count going down by a lot. Probably gc related.
To answer your question: you had forgotten to remove the tmpl created in initVCS (tmpl = vcs.createtemplate()).
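For the record, a minimal sketch of the fix (initVCS is the script's own function; the rest is illustrative, and it assumes vcs.removeobject is available as in recent vcs releases): either create the template once and reuse it, or remove the per-plot one when done.

import vcs

# Option 1: create the template once, outside the plotting loop, and reuse it
tmpl = vcs.createtemplate()

# Option 2: if a per-call template is really needed, remove it afterwards
def plot_frame(canvas, data, gm, outfile):
    t = vcs.createtemplate()
    try:
        canvas.plot(data, t, gm, bg=True)
        canvas.png(outfile)
    finally:
        canvas.clear()
        vcs.removeobject(t)  # drop it from vcs's element table so it can't accumulate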
@durack1 when plotting s1s instead of s1, and using s1s for the diff instead of s1 (which is what you actually meant), we get a smaller memory footprint and the bug takes longer to appear:
UV-CDAT version: 2.10-19-gba336f2
UV-CDAT prefix: /usr/local/anaconda2/envs/nightly
delFudge: False
Background graphics: True
donotstoredisplay: True
00001 processing: 1870-01 sic Time: 02.521 secs; Max mem: 1.039 GB PyObj#: 0113442;
00002 processing: 1870-02 sic Time: 02.421 secs; Max mem: 1.039 GB PyObj#: 0113443;
00003 processing: 1870-03 sic Time: 02.236 secs; Max mem: 1.039 GB PyObj#: 0113444;
00004 processing: 1870-04 sic Time: 02.225 secs; Max mem: 1.039 GB PyObj#: 0113445;
00005 processing: 1870-05 sic Time: 02.227 secs; Max mem: 1.039 GB PyObj#: 0113425;
00006 processing: 1870-06 sic Time: 02.474 secs; Max mem: 1.039 GB PyObj#: 0113391;
00007 processing: 1870-07 sic Time: 02.645 secs; Max mem: 1.039 GB PyObj#: 0113392;
00008 processing: 1870-08 sic Time: 02.649 secs; Max mem: 1.039 GB PyObj#: 0113393;
00009 processing: 1870-09 sic Time: 02.634 secs; Max mem: 1.053 GB PyObj#: 0113394;
00010 processing: 1870-10 sic Time: 02.603 secs; Max mem: 1.097 GB PyObj#: 0113395;
00011 processing: 1870-11 sic Time: 02.626 secs; Max mem: 1.147 GB PyObj#: 0113396;
00012 processing: 1870-12 sic Time: 02.627 secs; Max mem: 1.198 GB PyObj#: 0113397;
00013 processing: 1871-01 sic Time: 02.645 secs; Max mem: 1.249 GB PyObj#: 0113233;
00014 processing: 1871-02 sic Time: 02.655 secs; Max mem: 1.291 GB PyObj#: 0113234;
00015 processing: 1871-03 sic Time: 02.662 secs; Max mem: 1.343 GB PyObj#: 0113235;
@doutriaux1 thanks.. This script is just a dummy/demo for me at the moment, so I don't really care about the png output.. I was just hoping to highlight the timing and memory issues in a repeatable way so that you and @danlipsa could dig in and figure out a permanent solution.
I think your comments on the gc are right; as you can see, every 5 iterations or so the total object count drops by 20 (4->5) or more, e.g. 164 (12->13) above.
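If the jumpiness of the PyObj# column gets in the way, forcing a collection before counting should make it much more stable; a small sketch:

import gc

# Force a gc pass before counting so PyObj# reflects live objects
# rather than whatever happens to be awaiting collection.
gc.collect()
n_objects = len(gc.get_objects())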
@doutriaux1 the huge virtual memory footprint is also something that's concerning..
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
66302 durack1 20 0 12.3g 1.4g 72m R 99.9 0.6 1415:55 python make_newVsOldDiffs.py
12.3Gb when the script is generating a bunch of ~100Kb pngs is kinda concerning..
@durack1 I will send you the memory-leak-fixed script tomorrow. But yes, with mesalib you get the memory leak; there's nothing I can do about it right now. I will stop by tomorrow and try to get your pngs to be non-black.
The issues described in UV-CDAT/uvcdat#1424 will be investigated as part of the tweaks made to solve the problems reported here, following the VCS changes documented in UV-CDAT/vcs#236.