Open jkyang92 opened 2 years ago
Not sure if you saw #1689, but I think this is a great feature of capture:
use capture for memory leak detection and memory benchmarking!
In other words, I think the first example you have is evidence that there's a leak, but in M2's interpreter.
Also see https://github.com/Macaulay2/M2/issues/1728#issuecomment-750425861, where this was first noticed.
Okay, after messing the code, it seems we are leaking DictionaryList
objects (and consequently dictionaries and all the data they contain). It seems that we keep a list of all dictionaries in expr.d
? (allDictionaries
variable and the record
function) Can a dictionary ever get collected?
Yes, dictionaries don't go away -- that's so the dictionary can be recovered from the frame, which contains an integer called the frame ID. An alternative (and perhaps better) way to do it would be for each frame to contain a pointer to its dictionary.
Oh, I had assumed that when we call collegeGarbage
in capture
, discarded dictionaries get collected! But in reality nearly nothing that is assigned to a symbol is ever collected?! That seems like an oversight that should be fixed.
It would be great to also document how frames work along the way.
The dictionary caching has nothing to do with being assigned to a symbol -- they are cached so the dictionary can be recovered from the frame id, and that could be done away with as I sketched above. (The dictionaries cached contain just symbols, not "symbol closures". It's the symbol closures that point to frames containing symbol values.)
The number of dictionaries is limited -- once you have loaded all the code, no more dictionaries are made. Thus the non-collection of dictionaries is not a memory leak problem.
The dictionary caching has nothing to do with being assigned to a symbol -- they are cached so the dictionary can be recovered from the frame id, and that could be done away with as I sketched above. (The dictionaries cached contain just symbols, not "symbol closures". It's the symbol closures that point to frames containing symbol values.)
Could you please explain or better yet the document this system somewhere on the wiki? Even starting with some common terminology would be good.
The number of dictionaries is limited -- once you have loaded all the code, no more dictionaries are made. Thus the non-collection of dictionaries is not a memory leak problem.
capture
makes new dictionaries every time. If you run hundreds of tests, that's the equivalent of hundreds of new dictionaries.
Incidentally value(String)
also creates new dictionaries.
Could you please explain or better yet the document this system somewhere on the wiki? Even starting with some common terminology would be good.
Recall from LISP the notion of function closure, or simply closure. When one returns a function as a value, that function closure remembers the values of the variables at the time it was created. Here is an example in M2:
i1 : f = x -> () -> x
o1 = f
o1 : FunctionClosure
i2 : g = f 33
o2 = g
o2 : FunctionClosure
i3 : g()
o3 = 33
i4 : h = f 44
o4 = h
o4 : FunctionClosure
i5 : h()
o5 = 44
The code () ->x
is referred to as a function body. The lexical scope of the function body has associated with it a dictionary, whose single symbol is x
, but in general it will contain all the new variables appearing in the body of the function. When the function closures g
and h
are created by calling f
, the function body is paired with a "frame", which is an array of expressions providing the values of the variables in the lexical scope of the body of the function. So g
has a frame containing 33
and h
has a frame containing 44
. The symbol x
contains information that specifies where in the frame its value is stored, and when g
and h
are executed, they look into the frame at that spot to retrieve the value.
Similarly, M2 has "symbol closures". Here is an example:
i1 : f = x -> () -> symbol x
o1 = f
o1 : FunctionClosure
i2 : g = f 33
o2 = g
o2 : FunctionClosure
i3 : g()
o3 = x
o3 : Symbol
i4 : value oo
o4 = 33
i5 : h = f 44
o5 = h
o5 : FunctionClosure
i6 : h()
o6 = x
o6 : Symbol
i7 : value oo
o7 = 44
All the symbols appearing at top level are actually symbol closures in this sense. They contain a symbol and a pointer to the appropriate frame where the value is stored.
The number of dictionaries is limited -- once you have loaded all the code, no more dictionaries are made. Thus the non-collection of dictionaries is not a memory leak problem.
capture
makes new dictionaries every time. If you run hundreds of tests, that's the equivalent of hundreds of new dictionaries.
Ah, thanks for the reminder. So maybe it is time to implement the fix I proposed above. (On the other hand, dictionaries aren't very large, so it may not be urgent.)
Not related to the Dictionary
issue, but the following loop also leaks, independent of the previous issue. (I made record
in expr.d a no-op, most code doesn't need it to actually work).
testCode = ///
R = QQ[x,y,z];
x+1;
assert ( 1 == 1 )
///
scan(1000, i-> (capture(testCode,UserMode=>false);collectGarbage()));
I can test it later but I'm fairly certain it's leaking the polynomial ring on every cycle, because if I replace the first line of the test code with nearly anything else, we don't leak anymore.
I think this causes most tests to leak, since most tests create polynomial rings.
Just to make sure, you're disabling Usermode, right?
Yep, it's right there in the UserMode=>false
. Also, it actually doesn't seem to leak with UserMode=>true
.
I think I figured this out. The problem is how globalFrame
in expr.d
interacts with calls to new Dictionary
. When you call new Dictionary
in M2 you are actually creating a new DictionaryClosure
in the D code. This closure uses globalFrame
, and all values get stored there. The problem is after the dictionary for them goes out of scope, these values can no longer be removed and get collected. And of course the globalFrame
itself will never be collected.
Referencing a comment here:
I've ran some checks, including https://github.com/Macaulay2/M2/blob/048c777dc703365e44c3865b2c1faed698b82b3a/bugs/anton/MEMORY-LEAKS/NAG-leaks.m2 Everything seems to be fine.
BTW, if anyone has ideas on how to improve the memory leak testing (from the frontend) above, please let me know.
_Originally posted by @antonleykin in https://github.com/Macaulay2/M2/pull/2770#discussion_r1133789097_
I wonder if after fixing the problem above, we could use capture to detect memory leaks from the frontend.
capture
seems to have a slow memory leak. Running the following loop on my system increases M2's memory usage by ~0.5MB/s seemingly indefinitely. (I ran it till M2 was using 300MB).The following loop seems to leak much faster (1GB within 15min), but it's harder to figure out what's leaking:
In contrast the following loop does not cause an increase in memory usage:
Valgrind does not seem to detect a leak.