BitOne / php-meminfo

PHP extension to get insight about memory usage
MIT License

Take into account unreferencable circular references #83

Open mathieuk opened 6 years ago

mathieuk commented 6 years ago

A source of (perceived) memory leaks will be circular references (especially to objects) that haven't been picked up by the GC yet. php-meminfo should provide insight into objects that aren't currently referencable from the PHP userland but are still active in memory.

Example script where php-meminfo doesn't report the existence of the objects:

<?php

class X {
        function __construct($y) {
                $this->y = $y;
        }
}

class Y {
        function __construct() {
                $this->x = new X($this);
        }

}

function test() {
        return new Y;
}

test();

meminfo_dump(fopen('dump.json', 'w'));

The output will be:

{
  "header" : {
    "memory_usage" : 393120,
    "memory_usage_real" : 2097152,
    "peak_memory_usage" : 428880,
    "peak_memory_usage_real" : 2097152
  },
  "items": {
    "0x7f8ea4260100" : {
        "type" : "array",
        "size" : "72",
        "symbol_name" : "_GET",
        "is_root" : true,
        "frame" : "<GLOBAL>"
,
        "children" : {

        }

    },
    "0x7f8ea4260120" : {
        "type" : "array",
        "size" : "72",
        "symbol_name" : "_POST",
        "is_root" : true,
        "frame" : "<GLOBAL>"
,
        "children" : {

        }

    },
    "0x7f8ea4260140" : {
        "type" : "array",
        "size" : "72",
        "symbol_name" : "_COOKIE",
        "is_root" : true,
        "frame" : "<GLOBAL>"
,
        "children" : {

        }

    },
    "0x7f8ea4260160" : {
        "type" : "array",
        "size" : "72",
        "symbol_name" : "_FILES",
        "is_root" : true,
        "frame" : "<GLOBAL>"
,
        "children" : {

        }

    },
    "0x7f8ea4260180" : {
        "type" : "array",
        "size" : "72",
        "symbol_name" : "argv",
        "is_root" : true,
        "frame" : "<GLOBAL>"
,
        "children" : {
            "0":"0x7f8ea4266008"
        }

    },
    "0x7f8ea4266008" : {
        "type" : "string",
        "size" : "25",
        "is_root" : false

    },
    "0x7f8ea42601a0" : {
        "type" : "integer",
        "size" : "16",
        "symbol_name" : "argc",
        "is_root" : true,
        "frame" : "<GLOBAL>"

    },
    "0x7f8ea42601c0" : {
        "type" : "array",
        "size" : "72",
        "symbol_name" : "_ENV",
        "is_root" : true,
        "frame" : "<GLOBAL>"
,
        "children" : {

        }

    },
    "0x7f8ea42601e0" : {
        "type" : "array",
        "size" : "72",
        "symbol_name" : "_REQUEST",
        "is_root" : true,
        "frame" : "<GLOBAL>"
,
        "children" : {

        }

    },
    "0x7f8ea4260200" : {
        "type" : "array",
        "size" : "72",
        "symbol_name" : "_SERVER",
        "is_root" : true,
        "frame" : "<GLOBAL>"
,
        "children" : {
            "XDG_SESSION_ID":"0x7f8ea4260b00",
            "HOSTNAME":"0x7f8ea4260b20",
            "SELINUX_ROLE_REQUESTED":"0x7f8ea4260b40",
            "TERM":"0x7f8ea4260b60",
            "SHELL":"0x7f8ea4260b80",
            "HISTSIZE":"0x7f8ea4260ba0",
            "SSH_CLIENT":"0x7f8ea4260bc0",
            "PERL5LIB":"0x7f8ea4260be0",
            "SELINUX_USE_CURRENT_RANGE":"0x7f8ea4260c00",
            "QTDIR":"0x7f8ea4260c20",
            "OLDPWD":"0x7f8ea4260c40",
            "QTINC":"0x7f8ea4260c60",
            "PERL_MB_OPT":"0x7f8ea4260c80",
            "SSH_TTY":"0x7f8ea4260ca0",
            "QT_GRAPHICSSYSTEM_CHECKED":"0x7f8ea4260cc0",
            "USER":"0x7f8ea4260ce0",
            "LS_COLORS":"0x7f8ea4260d00",
            "SSH_AUTH_SOCK":"0x7f8ea4260d20",
            "MAIL":"0x7f8ea4260d40",
            "PATH":"0x7f8ea4260d60",
            "PWD":"0x7f8ea4260d80",
            "LANG":"0x7f8ea4260da0",
            "MODULEPATH":"0x7f8ea4260dc0",
            "LOADEDMODULES":"0x7f8ea4260de0",
            "KDEDIRS":"0x7f8ea4260e00",
            "SELINUX_LEVEL_REQUESTED":"0x7f8ea4260e20",
            "HISTCONTROL":"0x7f8ea4260e40",
            "SHLVL":"0x7f8ea4260e60",
            "HOME":"0x7f8ea4260e80",
            "PERL_LOCAL_LIB_ROOT":"0x7f8ea4260ea0",
            "LOGNAME":"0x7f8ea4260ec0",
            "QTLIB":"0x7f8ea4260ee0",
            "SSH_CONNECTION":"0x7f8ea4260f00",
            "LC_CTYPE":"0x7f8ea4260f20",
            "MODULESHOME":"0x7f8ea4260f40",
            "LESSOPEN":"0x7f8ea4260f60",
            "XDG_RUNTIME_DIR":"0x7f8ea4260f80",
            "QT_PLUGIN_PATH":"0x7f8ea4260fa0",
            "PERL_MM_OPT":"0x7f8ea4260fc0",
            "BASH_FUNC_module()":"0x7f8ea4260fe0",
            "_":"0x7f8ea4261000",
            "PHP_SELF":"0x7f8ea4261020",
            "SCRIPT_NAME":"0x7f8ea4261040",
            "SCRIPT_FILENAME":"0x7f8ea4261060",
            "PATH_TRANSLATED":"0x7f8ea4261080",
            "DOCUMENT_ROOT":"0x7f8ea42610a0",
            "REQUEST_TIME_FLOAT":"0x7f8ea42610c0",
            "REQUEST_TIME":"0x7f8ea42610e0",
            "argv":"0x7f8ea4261100",
            "argc":"0x7f8ea4261120"
        }

    },
    "0x7f8ea4260b00" : {
        "type" : "string",
        "size" : "18",
        "is_root" : false

    },
    "0x7f8ea4260b20" : {
        "type" : "string",
        "size" : "37",
        "is_root" : false

    },
    "0x7f8ea4260b40" : {
        "type" : "string",
        "size" : "16",
        "is_root" : false

    },
    "0x7f8ea4260b60" : {
        "type" : "string",
        "size" : "30",
        "is_root" : false

    },
    "0x7f8ea4260b80" : {
        "type" : "string",
        "size" : "25",
        "is_root" : false

    },
    "0x7f8ea4260ba0" : {
        "type" : "string",
        "size" : "20",
        "is_root" : false

    },
    "0x7f8ea4260bc0" : {
        "type" : "string",
        "size" : "33",
        "is_root" : false

    },
    "0x7f8ea4260be0" : {
        "type" : "string",
        "size" : "46",
        "is_root" : false

    },
    "0x7f8ea4260c00" : {
        "type" : "string",
        "size" : "16",
        "is_root" : false

    },
    "0x7f8ea4260c20" : {
        "type" : "string",
        "size" : "33",
        "is_root" : false

    },
    "0x7f8ea4260c40" : {
        "type" : "string",
        "size" : "29",
        "is_root" : false

    },
    "0x7f8ea4260c60" : {
        "type" : "string",
        "size" : "41",
        "is_root" : false

    },
    "0x7f8ea4260c80" : {
        "type" : "string",
        "size" : "50",
        "is_root" : false

    },
    "0x7f8ea4260ca0" : {
        "type" : "string",
        "size" : "26",
        "is_root" : false

    },
    "0x7f8ea4260cc0" : {
        "type" : "string",
        "size" : "17",
        "is_root" : false

    },
    "0x7f8ea4260ce0" : {
        "type" : "string",
        "size" : "23",
        "is_root" : false

    },
    "0x7f8ea4260d00" : {
        "type" : "string",
        "size" : "1725",
        "is_root" : false

    },
    "0x7f8ea4260d20" : {
        "type" : "string",
        "size" : "47",
        "is_root" : false

    },
    "0x7f8ea4260d40" : {
        "type" : "string",
        "size" : "39",
        "is_root" : false

    },
    "0x7f8ea4260d60" : {
        "type" : "string",
        "size" : "154",
        "is_root" : false

    },
    "0x7f8ea4260d80" : {
        "type" : "string",
        "size" : "31",
        "is_root" : false

    },
    "0x7f8ea4260da0" : {
        "type" : "string",
        "size" : "27",
        "is_root" : false

    },
    "0x7f8ea4260dc0" : {
        "type" : "string",
        "size" : "63",
        "is_root" : false

    },
    "0x7f8ea4260de0" : {
        "type" : "string",
        "size" : "16",
        "is_root" : false

    },
    "0x7f8ea4260e00" : {
        "type" : "string",
        "size" : "20",
        "is_root" : false

    },
    "0x7f8ea4260e20" : {
        "type" : "string",
        "size" : "16",
        "is_root" : false

    },
    "0x7f8ea4260e40" : {
        "type" : "string",
        "size" : "26",
        "is_root" : false

    },
    "0x7f8ea4260e60" : {
        "type" : "string",
        "size" : "17",
        "is_root" : false

    },
    "0x7f8ea4260e80" : {
        "type" : "string",
        "size" : "29",
        "is_root" : false

    },
    "0x7f8ea4260ea0" : {
        "type" : "string",
        "size" : "36",
        "is_root" : false

    },
    "0x7f8ea4260ec0" : {
        "type" : "string",
        "size" : "23",
        "is_root" : false

    },
    "0x7f8ea4260ee0" : {
        "type" : "string",
        "size" : "37",
        "is_root" : false

    },
    "0x7f8ea4260f00" : {
        "type" : "string",
        "size" : "43",
        "is_root" : false

    },
    "0x7f8ea4260f20" : {
        "type" : "string",
        "size" : "21",
        "is_root" : false

    },
    "0x7f8ea4260f40" : {
        "type" : "string",
        "size" : "34",
        "is_root" : false

    },
    "0x7f8ea4260f60" : {
        "type" : "string",
        "size" : "41",
        "is_root" : false

    },
    "0x7f8ea4260f80" : {
        "type" : "string",
        "size" : "30",
        "is_root" : false

    },
    "0x7f8ea4260fa0" : {
        "type" : "string",
        "size" : "61",
        "is_root" : false

    },
    "0x7f8ea4260fc0" : {
        "type" : "string",
        "size" : "48",
        "is_root" : false

    },
    "0x7f8ea4260fe0" : {
        "type" : "string",
        "size" : "57",
        "is_root" : false

    },
    "0x7f8ea4261000" : {
        "type" : "string",
        "size" : "28",
        "is_root" : false

    },
    "0x7f8ea4261020" : {
        "type" : "string",
        "size" : "25",
        "is_root" : false

    },
    "0x7f8ea4261040" : {
        "type" : "string",
        "size" : "25",
        "is_root" : false

    },
    "0x7f8ea4261060" : {
        "type" : "string",
        "size" : "25",
        "is_root" : false

    },
    "0x7f8ea4261080" : {
        "type" : "string",
        "size" : "25",
        "is_root" : false

    },
    "0x7f8ea42610a0" : {
        "type" : "string",
        "size" : "16",
        "is_root" : false

    },
    "0x7f8ea42610c0" : {
        "type" : "float",
        "size" : "16",
        "is_root" : false

    },
    "0x7f8ea42610e0" : {
        "type" : "integer",
        "size" : "16",
        "is_root" : false

    },
    "0x7f8ea4261100" : {
        "type" : "array",
        "size" : "72",
        "is_root" : false
,
        "children" : {
            "0":"0x7f8ea4266008"
        }

    },
    "0x7f8ea4261120" : {
        "type" : "integer",
        "size" : "16",
        "is_root" : false

    }
}
}

mathieuk commented 6 years ago

I have a prototype change available (https://github.com/mathieuk/php-meminfo/tree/browse_object_store) that detects, to some extent, that instances of Y and X still exist in memory at that point. For now, the output would be:

$ php -dextension=meminfo.so test.php
Found unseen alive object: X #2 (0x7fb51a465820)
Found unseen alive object: Y #1 (0x7fb51a4657d0)

And in the output of meminfo_dump():

"0x7fb51a465820":   {
        "class" : "X",
        "is_root" : false,
        "frame" : "<OBJECTS_IN_OBJECT_STORE>",
        "object_handle" : "2",
        "type" : "object",
        "size" : "56"
} ,
"0x7fb51a4657d0":   {
        "class" : "Y",
        "is_root" : false,
        "frame" : "<OBJECTS_IN_OBJECT_STORE>",
        "object_handle" : "1",
        "type" : "object",
        "size" : "56"
}
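
As a rough sketch of how such a dump could be consumed, the snippet below filters a decoded dump for entries carrying the prototype's "<OBJECTS_IN_OBJECT_STORE>" frame label. It assumes those entries sit under the same "items" key as the regular dump; the findUnreachableObjects() helper and the trimmed sample data are illustrative and not part of php-meminfo.

```php
<?php

// Hypothetical helper: list objects that the prototype found only in the
// object store, i.e. not reachable from any execution frame or static.
function findUnreachableObjects(array $dump): array
{
    $found = [];
    foreach ($dump['items'] ?? [] as $address => $item) {
        if (($item['frame'] ?? null) === '<OBJECTS_IN_OBJECT_STORE>') {
            $found[$address] = $item['class'] ?? '(unknown)';
        }
    }
    return $found;
}

// Trimmed sample in the shape shown above (assumed layout).
$json = <<<'JSON'
{
  "items": {
    "0x7f8ea42601a0": {"type": "integer", "size": "16", "symbol_name": "argc", "is_root": true, "frame": "<GLOBAL>"},
    "0x7fb51a465820": {"class": "X", "is_root": false, "frame": "<OBJECTS_IN_OBJECT_STORE>", "object_handle": "2", "type": "object", "size": "56"},
    "0x7fb51a4657d0": {"class": "Y", "is_root": false, "frame": "<OBJECTS_IN_OBJECT_STORE>", "object_handle": "1", "type": "object", "size": "56"}
  }
}
JSON;

$unreachable = findUnreachableObjects(json_decode($json, true));

foreach ($unreachable as $address => $class) {
    echo "$class at $address\n";
}
```

With a real dump, the same helper could be fed json_decode(file_get_contents('dump.json'), true).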

It doesn't add the properties of the objects yet, because I haven't figured out how to do that. Would you like this to be added in this form?

I think it would be interesting to, say, overload the NEW operator and keep track of where objects are instantiated to give the user a fighting chance of determining where their cycles are originating.

BitOne commented 6 years ago

Hey @mathieuk ,

Can you elaborate on your use case? Objects that have circular references but will be collected by the garbage collector don't constitute a memory leak, as the memory is going to be cleaned up. So dumping them will not help in understanding memory leaks.
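
To illustrate the point about the collector, here is a minimal sketch that reuses the X and Y classes from the example script and observes the cycle through a WeakReference (available from PHP 7.4). The makeCycle() helper and the variable names are illustrative only, and the sketch assumes the cycle collector has not been disabled.

```php
<?php

class X {
    public $y;
    function __construct($y) {
        $this->y = $y;
    }
}

class Y {
    public $x;
    function __construct() {
        $this->x = new X($this);
    }
}

// Hypothetical helper: builds the Y <-> X cycle and returns only a weak
// reference, so no userland variable keeps the objects alive.
function makeCycle(): WeakReference {
    $y = new Y;
    return WeakReference::create($y);
}

$ref = makeCycle();

// The cycle keeps both objects alive although nothing in userland points to them.
$aliveBefore = $ref->get() !== null;

// Force a collector run; the return value is the number of items it freed.
$collected = gc_collect_cycles();

// Once the cycle has been collected, the weak reference is cleared.
$aliveAfter = $ref->get() !== null;

var_dump($aliveBefore, $collected > 0, $aliveAfter);
```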

By the way, what you describe used to be the behavior of PHP Meminfo: dumping the content of the object store (https://github.com/BitOne/php-meminfo/blob/v0.1.0/meminfo.c#L132 for history).

This had several drawbacks:

These are the reasons why I decided to change the exploration approach to go through all execution frames plus all statically defined variables.

mathieuk commented 6 years ago

Hi @BitOne,

I've had a few run-ins with circular references, and when you're using frameworks (like Laravel, which I'm using) it can be a real hassle to find out where things are going wrong. In my case, database connections were being kept open in a job worker despite code trying to close them. It turned out that the Laravel database class has a method getDoctrineConnection() that I called somewhere, which calls into Doctrine\DBAL\Connection, which creates a circular reference to itself. I actually wrapped those calls with gc_enable() and gc_collect_cycles(), but it never collected it.

It took me forever to find that.

So, my goal was to improve that situation by giving the developer more information. I figured that by detecting all objects that haven't been seen in the stack frames or static properties, you'd at least see that there is (potential for) a circular reference that can cause issues. But that isn't quite enough, as you still have little information about why this is happening.

So, to detect this, you would have to know where references are created and which are still alive at a certain point. I've been thinking I could maybe look into overloading the ASSIGN, ASSIGN_DIM and ASSIGN_OBJ opcodes to keep track of (object) assignments. It'd slow the code down a lot, but I suppose that's OK in a debugging situation. By finding the collected references that are not in visited_items, I think you could then detect dangling references, which would be interesting to inspect when you have that issue.

I suppose this is getting a bit out of the meminfo_dump() scope; though it might still fit in the extension as a different feature.

What do you think? Is this something you think would be useful to have in meminfo?

BitOne commented 5 years ago

Hi @mathieuk ,

Thanks for the detailed explanation. So if I understand correctly, you had a situation with a circular reference between objects without any living reference (exec frame or static) that was not collected by the garbage collector.

If that's the case, the problem is in the PHP garbage collector itself: it's a bug in the Zend Engine.

And in this case, this falls outside the scope of PHP Meminfo, as it's no longer a memory leak caused by the PHP program, but a bug in the Zend Engine.

But maybe PHP Meminfo doesn't have all the proper entry points yet, and this was not a dangling reference, but something that PHP Meminfo did not detect.

But I think I'm getting convinced by the idea of having something to dump all objects in memory, using the previous code ;) . At least to debug PHP Meminfo itself: if a lot of objects appear in the "non-accessible" objects, then something is missing. Maybe as a different part of the output format, by providing an option to meminfo_dump?

And by the way, if you have a working example of your use case that you can share, that would be wonderful ;)

muglug commented 5 years ago

And by the way, if you have a working example of your use case that you can share, that would be wonderful ;)

Hey - I have a use-case that may be of interest - a project where gc_disable() is called at the beginning of execution because the cycle counter slows stuff down, and it's very important that the application (a static analysis tool) is fast, especially when running as a language server when used with an IDE.
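
A minimal sketch of why this setup leaks, assuming PHP 7.4+ for WeakReference: with gc_disable() in effect, unreachable cycles are not collected automatically however many pile up, and only an explicit gc_collect_cycles() call reclaims them. The Node class and the variable names are hypothetical and only serve the illustration.

```php
<?php

// Hypothetical self-referencing class, used only to build cycles.
final class Node {
    public $self;
}

// Turn off automatic cycle collection, as the project described above does.
gc_disable();

$first = null;
for ($i = 0; $i < 20000; $i++) {
    $node = new Node;
    $node->self = $node;          // the object references itself: a cycle
    if ($i === 0) {
        // Watch the first object without keeping it alive ourselves.
        $first = WeakReference::create($node);
    }
    unset($node);                 // no userland reference is left
}

// Nothing has been reclaimed, although every Node is unreachable.
$leakedWhileDisabled = $first->get() !== null;

// A manual run still works while automatic collection is disabled.
$freed = gc_collect_cycles();
$goneAfterManualRun = $first->get() === null;

var_dump($leakedWhileDisabled, $freed > 0, $goneAfterManualRun);
```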

I've just spent the last six or so hours tracking down memory leaks with the help of this package. The first 4 hours were relatively fruitless, but then I added the suggested improvement from @mathieuk (to identify reference-cycle leaks) and the last two hours have been much more productive!

Nevertheless, you've built a wonderful tool!