Open kbaum opened 11 years ago
I usually do the analysis with grep and small scripts around it. But there's a pre-pre-alpha tool by @oruen: https://github.com/oruen/trailblazer
I have a memory leak and I have successfully dumped the heap, but now I am having a really hard time knowing where to begin.
Do you have a way to replicate the leak?
The best way is to examine a leak in progress and watch HeapDump.count_objects with your namespace as the leak grows. That way you can usually figure out which types of objects are leaking.
There are also usually some objects, like sessions/controllers/requests, whose counts are proportional to the load you throw at the app and which should be cleaned up when the load is over.
HeapDump runs GC by default, so whatever appears in the count/dump is either leaked or in use at that moment.
Then trace these objects' references back to some root objects (globals, class variables, etc.). Also look for Symbol#to_proc (usually used in constructs like arr.map(&:this_is_the_symbol)); it tends to leave references to the context (VM/env/object self) lingering in a cache.
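To make "watch the counts as the leak grows" concrete, here is a minimal sketch that compares two count snapshots and lists the types that grew. It assumes the snapshots are JSON strings shaped like the count_objects output that appears in this thread; count_growth and the sample numbers are hypothetical and not part of heap_dump.

```ruby
require 'json'

# Compare two count snapshots (parsed JSON hashes) and return the
# types whose counts grew, largest growth first.
def count_growth(before, after, section = "basic_types")
  b = before.fetch(section, {})
  a = after.fetch(section, {})
  growth = a.map { |type, count| [type, count - b.fetch(type, 0)] }
  growth.select { |_, delta| delta > 0 }.sort_by { |_, delta| -delta }
end

# Hypothetical snapshots (made-up numbers), taken some time apart.
before = JSON.parse('{"basic_types": {"T_STRING": 1000, "T_ARRAY": 500, "T_HASH": 50}}')
after  = JSON.parse('{"basic_types": {"T_STRING": 1800, "T_ARRAY": 520, "T_HASH": 50}}')

count_growth(before, after).each do |type, delta|
  puts "#{type}: +#{delta}"
end
```

Passing "user_types" as the third argument would apply the same comparison to the namespaced counts, assuming that section has the same type-to-count shape.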
Regarding HeapDump.count_objects, I don't have a namespace that I use for all of my objects. When I call HeapDump.count_objects, I just get something like:
[4] pry(main)> puts HeapDump.count_objects
{
"total_slots": 754211,
"free_slots": 103849,
"basic_types": {
"T_OBJECT": 16798,
"T_CLASS": 9724,
"T_MODULE": 1909,
"T_FLOAT": 9,
"T_STRING": 344722,
"T_REGEXP": 3528,
"T_ARRAY": 154334,
"T_HASH": 12191,
"T_STRUCT": 608,
"T_BIGNUM": 14,
"T_FILE": 5,
"T_DATA": 59356,
"T_MATCH": 87,
"T_COMPLEX": 1,
"T_RATIONAL": 77,
"T_NODE": 44205,
"T_ICLASS": 2794
},
"user_types": {
}
}
=> nil
Re: tracing these object references, I am having a hard time understanding how to trace references. I think the problem is that I don't fully understand the meaning of all the fields within the JSON. How do I know if one object references another?
Thanks for your help!
count_objects cannot determine that for you, as only you know the structure of your code.
By namespace I mean the C++ term; you can think of it as a root module/class. For example:
module ThisIsNamespace
  class SomeClass
    ...
  end
  class SomeAnotherClass
    ...
  end
end
Classes will be named ThisIsNamespace::SomeClass, ThisIsNamespace::SomeAnotherClass, etc., so you can count their instances without naming them all: HeapDump.count_objects([ThisIsNamespace, SomeOtherNamespace])
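For illustration only, the idea of counting by namespace can be approximated with Ruby's built-in ObjectSpace. This is a sketch of the concept, not how heap_dump implements it; count_namespace_instances is a hypothetical helper name.

```ruby
# Count live instances of every named class defined under the given
# namespaces. Only exact instances are counted, not subclass instances.
def count_namespace_instances(namespaces)
  prefixes = namespaces.map { |ns| "#{ns.name}::" }
  # Collect the matching classes first, then count instances of each.
  classes = ObjectSpace.each_object(Class).select do |klass|
    name = klass.name
    name && prefixes.any? { |prefix| name.start_with?(prefix) }
  end
  classes.each_with_object({}) do |klass, counts|
    counts[klass.name] = ObjectSpace.each_object(klass).count { |o| o.instance_of?(klass) }
  end
end

module ThisIsNamespace
  class SomeClass; end
  class SomeAnotherClass; end
end

# Keep references so the instances stay alive across the GC run.
kept = [ThisIsNamespace::SomeClass.new,
        ThisIsNamespace::SomeClass.new,
        ThisIsNamespace::SomeAnotherClass.new]

GC.start
counts = count_namespace_instances([ThisIsNamespace])
counts.each { |name, n| puts "#{name}: #{n}" }
```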
One object references another if (simplified) it has the other object's id stored in it. Most long numbers in the dump are ids. Each line contains one object, so you can simply grep for the target id; this will give you the object itself and all objects that reference it.
Unfortunately, in the current version there's no way to tell a plain number from an id, but you usually know whether you store id values alongside references (@foo = other_obj.object_id will not produce a hard reference, but will show up in the dump).
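The grep step can be sketched in Ruby as follows. The only thing assumed about the dump is what is stated above: one object per line, with ids appearing as long numbers. The sample lines and their field names are invented for illustration and are not the real dump format; lines_mentioning is a hypothetical helper.

```ruby
# Return every dump line that mentions the given id as a whole number,
# so that id 70002 does not match inside 170002.
def lines_mentioning(lines, id)
  pattern = /(?<!\d)#{Regexp.escape(id.to_s)}(?!\d)/
  lines.select { |line| line.match?(pattern) }
end

# Hypothetical dump lines; field names are made up for this example.
sample = [
  '{"id":70001,"type":"object","refs":[70002]}',
  '{"id":70002,"type":"string","value":"hello"}',
  '{"id":70003,"type":"array","val":[70002,170002]}',
  '{"id":170002,"type":"string","value":"other"}'
]

# Prints the object itself (second line) plus the two lines that mention it.
lines_mentioning(sample, 70002).each { |line| puts line }
```

On a real dump, the same helper could be fed with File.foreach on the dump file. As noted above, a match may be a plain integer rather than a reference, so each hit still needs a look.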
Regarding count_objects, I have no idea which classes are leaking, so my first instinct is to just count them all. How do I know where to start? Why not just have a way to count all objects?
RE: tracing object references. I think the format of the JSON should allow generic scripts to make sense of the heap dump without understanding the developer's object model. Imagine how much more usage heap_dump would get if it came with some reusable logic that could analyse your heap for you.
Memprof doesn't work with Ruby 1.9+, but have you seen this presentation?
http://www.scribd.com/doc/30739474/Debugging-Ruby-with-MongoDB
I think the nice thing about memprof is that it produces JSON that allows for reusable introspection of the data. The example Mongo queries in the presentation should work for anyone's heap.
thx!
Unfortunately, there's no silver bullet. No one can tell whether an object is actually leaked without understanding the object model and what the program does in general. Some global variables may store objects for a long time on purpose, while the same behaviour in other cases may be a leak.
The idea is to separate basic types and library types from your own so that you have less noise, and then compare what you observe with what you expect. For example, if you are debugging Rails or similar, you know that you have one controller instance per request, which should be deleted after the request has been processed. So make a counter for controllers and see which ones are not deleted. You do not have to find the leak at once: go in steps, isolate parts of the program, etc.
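A rough sketch of such a counter, using plain ObjectSpace rather than heap_dump: take a baseline, apply load, then see which classes did not return to the baseline. FakeController and LEAK are hypothetical stand-ins that simulate a per-request object and an accidental long-lived reference.

```ruby
# Count live instances of each given class after forcing a GC run.
def instance_counts(classes)
  GC.start
  classes.map { |klass| [klass.name, ObjectSpace.each_object(klass).count] }.to_h
end

# Entries whose count is higher than in the baseline snapshot.
def not_released(baseline, current)
  current.select { |name, count| count > baseline.fetch(name, 0) }
end

class FakeController; end   # hypothetical per-request object
LEAK = []                   # simulates a reference that is never dropped

baseline = instance_counts([FakeController])
3.times { LEAK << FakeController.new }   # "requests" whose controllers are retained
after_load = instance_counts([FakeController])

not_released(baseline, after_load).each do |name, count|
  puts "#{name}: #{count} alive (baseline #{baseline[name]})"
end
```

In a real app the class list would be your own controllers or sessions, and the second snapshot would be taken once the load has stopped.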
Are there any example scripts that analyze this heap dump and perhaps produce some type of visualization with a tool like graphviz? The heap_dump looks incredibly useful but I am having a difficult time figuring out how to use it to trace a memory leak.
thx!
-karl