BitOne / php-meminfo

PHP extension to get insight about memory usage
MIT License

Objects that are still in memory but not shown by php-meminfo #63

Closed mathieuk closed 6 years ago

mathieuk commented 6 years ago

I tried using php-meminfo to find out why, in one specific case, PDO connections were being kept open. A PDO connection only closes once it goes out of scope with no references left (refcount of zero), so I kinda knew where to look.

After addressing some issues ( https://github.com/BitOne/php-meminfo/pull/62 ) and even adding a little object store dumper to this extension (see paste below) I noticed that even though php-meminfo wasn't showing me these instances, they were still actively in memory.

This problem occurred in a Laravel+Laravel Doctrine project. I suspect that these PDO instances might be referenced in closures somewhere? But maybe they're referenced in a different call stack? Is there any way we could have php-meminfo actually find references to these objects somehow?

Attachment, overly simple object_store_dump():

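PHP_FUNCTION(object_store_dump)
{
        zend_object **obj_ptr, **end, *obj;

        /* This function takes no arguments. */
        if (zend_parse_parameters_none() == FAILURE) {
                return;
        }

        /* Bucket 0 is never used, so top <= 1 means the store holds no objects. */
        if (EG(objects_store).top <= 1) {
                return;
        }

        /* Walk from the most recently allocated bucket down to bucket 1. */
        end = EG(objects_store).object_buckets + 1;
        obj_ptr = EG(objects_store).object_buckets + EG(objects_store).top;

        do {
                obj_ptr--;
                obj = *obj_ptr;

                /* Freed slots hold free-list entries, not objects: skip them. */
                if (IS_OBJ_VALID(obj)) {
                        php_printf("object: %s: %u\n", ZSTR_VAL(obj->ce->name), obj->handle);

                        if (!(GC_FLAGS(obj) & IS_OBJ_DESTRUCTOR_CALLED)) {
                                php_printf("- DESTRUCTOR NOT CALLED\n");
                        }

                        if (!(GC_FLAGS(obj) & IS_OBJ_FREE_CALLED)) {
                                php_printf("- FREE NOT CALLED\n");
                        }
                }
        } while (obj_ptr != end);
}
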
AD7six commented 6 years ago

I have found this tool to be pretty interesting - but it does seem to miss a lot more than it catches.

I am using php 7.1.17 and applied the linked patch (significant help - thanks :)). With some modifications to the summary cli to illustrate, from a dump of a cli process:

-> analyzer summary start.json 
+---------------------------------------+------------------+-----------------------------+
| Type                                  | Instances Count  | Cumulated Self Size (bytes) |
+---------------------------------------+------------------+-----------------------------+
| memory_usage                          |                  |                    10607152 |
| memory_usage_real                     |                  |                    12582912 |
| peak_memory_usage                     |                  |                    10650960 |
| peak_memory_usage_real                |                  |                    12582912 |
| Itemized                              |                  |                      377841 |
| Unaccounted                           |                  |                    10229311 |
+---------------------------------------+------------------+-----------------------------+
| string                                |             4271 |                      197393 |
| array                                 |             1776 |                      127872 |
| null                                  |              583 |                        9328 |
| integer                               |              291 |                        4656 |
| sfCommandOption                       |              166 |                       11952 |
| boolean                               |              105 |                        1680 |
| sfEventDispatcher                     |               76 |                        5472 |
| sfAnsiColorFormatter                  |               65 |                        4680 |
| sfCommandArgument                     |               56 |                        4032 |
| unknown                               |                9 |                         144 |
+---------------------------------------+------------------+-----------------------------+

By summing the data the summary table contains, it seems that in this test example I was only able to account for 377841 bytes, out of 10607152 - less than 4%. Or am I misrepresenting the data in the output from php-meminfo?
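The arithmetic behind that question can be checked with a small sketch in plain C. It only assumes that "Unaccounted" is `memory_usage` minus "Itemized", which is what the figures in the table suggest; the function names `unaccounted_bytes` and `itemized_percent` are hypothetical and are not part of php-meminfo or its analyzer.

```c
#include <stdint.h>

/* Bytes of memory_usage that no itemized entry explains.
   Assumption: unaccounted = memory_usage - itemized (clamped at 0). */
static uint64_t unaccounted_bytes(uint64_t memory_usage, uint64_t itemized)
{
    return itemized > memory_usage ? 0 : memory_usage - itemized;
}

/* Share of memory_usage that the itemized entries explain, in percent. */
static double itemized_percent(uint64_t memory_usage, uint64_t itemized)
{
    if (memory_usage == 0) {
        return 0.0;
    }
    return 100.0 * (double)itemized / (double)memory_usage;
}
```

With the values from the table, `unaccounted_bytes(10607152, 377841)` gives 10229311 and `itemized_percent(10607152, 377841)` is a little over 3.5, which matches the "less than 4%" reading.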

mathieuk commented 6 years ago

The way I understand it, php-meminfo currently works by walking the current call stack and inspecting the symbol tables available for each stack frame. Any array and object that it encounters is inspected in detail. That works for many cases, but I think it won't find things like:

aftabnaveed commented 6 years ago

I am having a similar issue on PHP 7.1.12. Will that branch be merged any time soon?

aftabnaveed commented 6 years ago

@AD7six are you able to share your modifications to the summary?

Thanks.

nathanielks commented 6 years ago

I'm in a similar predicament. The amount of memory reported by analyzer summary dump.json only accounts for 2.9 KB, whereas memory_usage is 1.2 GB, so something is amiss.

aftabnaveed commented 6 years ago

This project seems to be dead: the author does not respond, and it does not even work out of the box with PHP 7.1. The fix suggested by @mathieuk is not merged, but I still doubt its accuracy. I tried it with one of my Laravel applications and it for sure was not reporting the info correctly.

BitOne commented 6 years ago

Sorry guys, I was on holidays, this happens ;) I will have a look at the reported problem ASAP.

BitOne commented 6 years ago

@mathieuk is right in his description. Memory that is managed by an extension and not linked to a PHP variable will not be accounted for in the summary. By the way, the memory total in the summary doesn't take some internal consumption into account either, so it's only accurate for the self size of scalar values (like string or int). @aftabnaveed: if you have a working example of your Laravel application that you could share, I'm interested.

And by the way, the previous version of this extension used the object store to get all the buckets. But this solution has several shortcomings:

So walking through the execution frames was the best way to get all items that are really in memory AND still attached to living references.
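To make that trade-off concrete, here is a toy model in plain C. It is not the extension's code and uses no Zend API: objects are reduced to a list of references, the variables of all frames are flattened into one array of roots, and every name (`toy_object`, `mark_reachable`, `count_reported`) is hypothetical. It only shows that a dump driven by the frames reports what the roots can reach, and nothing else that may still sit in the store.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJECTS 8   /* size of the toy object store */
#define MAX_REFS    4   /* references one toy object can hold */

/* Toy stand-in for an object: it only records which objects it references. */
typedef struct {
    int refs[MAX_REFS];  /* indexes into the store, -1 marks an unused slot */
} toy_object;

/* Mark idx and everything reachable from it, depth-first.
   The seen[] check also stops the recursion on reference cycles. */
static void mark_reachable(const toy_object *store, int idx, bool *seen)
{
    if (idx < 0 || idx >= MAX_OBJECTS || seen[idx]) {
        return;
    }
    seen[idx] = true;
    for (size_t i = 0; i < MAX_REFS; i++) {
        mark_reachable(store, store[idx].refs[i], seen);
    }
}

/* Start from the variables found in the frames and count the objects a
   frame-based dump would report. seen[] must have MAX_OBJECTS entries. */
static int count_reported(const toy_object *store,
                          const int *frame_vars, size_t n_vars, bool *seen)
{
    int count = 0;

    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        seen[i] = false;
    }
    for (size_t i = 0; i < n_vars; i++) {
        mark_reachable(store, frame_vars[i], seen);
    }
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (seen[i]) {
            count++;
        }
    }
    return count;
}
```

In this model, two objects that only reference each other and are held by no frame variable are never marked, even though they still occupy slots in the store.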

aftabnaveed commented 6 years ago

@BitOne thanks for your response. Good to see it is still active :-). I was trying to figure out some memory leaks in PHP-PM project, and you can follow up the meminfo output here in this thread.

https://github.com/php-pm/php-pm/issues/382

mathieuk commented 6 years ago

If I recall correctly, with my Laravel issue, I had plenty of objects still in the object store that were not reachable through the call stack. So maybe the right way is to take everything in the object store and everything in the call stack (while taking care not to count the same object twice)? That would leave collectible objects. Maybe the meminfo extension could enable the GC in RINIT and run gc_collect_cycles() before returning the results from the meminfo function?
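The "don't count the same object twice" part could look roughly like this. It is a standalone sketch in plain C, assuming each object is identified by a small integer handle and that both sources have already been collected into arrays; `MAX_HANDLE` and `count_unique_handles` are hypothetical names, and nothing here is taken from php-meminfo.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HANDLE 1024  /* assumed upper bound on handles in this sketch */

/* Count distinct object handles across two sources: the handles found by
   walking the call stack and the handles found in the object store.
   Handles at or above MAX_HANDLE are ignored to keep the sketch bounded. */
static size_t count_unique_handles(const uint32_t *from_frames, size_t n_frames,
                                   const uint32_t *from_store, size_t n_store)
{
    bool seen[MAX_HANDLE] = { false };
    const uint32_t *sources[2] = { from_frames, from_store };
    const size_t lengths[2] = { n_frames, n_store };
    size_t unique = 0;

    for (size_t s = 0; s < 2; s++) {
        for (size_t i = 0; i < lengths[s]; i++) {
            uint32_t handle = sources[s][i];

            if (handle >= MAX_HANDLE || seen[handle]) {
                continue;  /* out of range, or already counted */
            }
            seen[handle] = true;
            unique++;
        }
    }
    return unique;
}
```

For example, handles 1, 2, 3 from the frames and 2, 3, 4, 5 from the store give five distinct objects, not seven.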

On 7 Aug 2018 at 11:27, Aftab Naveed notifications@github.com wrote:

@BitOne thanks for your response. Good to see it is still active :-). I was trying to figure out some memory leaks in PHP-PM project, and you can follow up the meminfo output here in this thread.

php-pm/php-pm#382


BitOne commented 6 years ago

@mathieuk: if you get objects from the object store, you will also get the objects that are collectible by the GC. And as the point of the extension is to track memory leaks, I would say that objects that are collectible, and so will soon be freed by the GC, are out of scope.

BitOne commented 6 years ago

@aftabnaveed : Thanks, I had a look at the thread on php-pm.

It seems that the issue has been fixed on the php-pm side. But there is still the remark from @andig in that thread about having multiple Application instances, so something is definitely fishy here.

aftabnaveed commented 6 years ago

Yes, it turned out to be a different issue, I guess meminfo was still not reporting it accurately.

BitOne commented 6 years ago

I will try to install Laravel and see if I'm able to reproduce the "multiple application instances" issue.

BitOne commented 6 years ago

Hey @mathieuk and @aftabnaveed ,

Thanks to your input, I have been able to reproduce two issues that seem to be linked to some changes on PHP 7 internals: https://github.com/BitOne/php-meminfo/issues/68 https://github.com/BitOne/php-meminfo/issues/67

These issues don't exist in PHP 5.

#67 is certainly linked to https://github.com/BitOne/php-meminfo/pull/62. I will try it and merge ASAP.

Thanks for your help people!

BitOne commented 6 years ago

@mathieuk, the initial question boils down to this: extensions use a lot of memory internally. The same goes for the Zend Engine itself (to compile PHP scripts, for the VM execution, etc.).

Some parts of this memory consumption are done through PHP objects. These objects are accessible through the objects store.

But the main part of the memory consumption comes from hashtables, specific structures and scalars that are managed inside these extensions, and so there's no way to access them and dump them.

So maybe there's no point in getting objects unrelated to the PHP program itself. Maybe worse, it could give the false belief that we can provide full insight into the memory used by the extension, whereas we can only provide a very incomplete view.

Even giving the information about the memory used by the whole system confuses people (see https://github.com/BitOne/php-meminfo/issues/63#issuecomment-399576204 for example), who then try to compare it to the memory used by the data manipulated by their program.

So I'm not sure we should go that way. Giving more details in the documentation on why we cannot just sum up the size of each item and arrive at the value reported by memory_get_usage could be a better way.

Or maybe removing this information from the dump altogether, if it's too confusing.

BitOne commented 6 years ago

I'm closing this issue, as it was the starting point of a lot of different subjects, on 3 different bugs and 1 misunderstanding of the memory usage information. I've added some information in the README.md on this subject: https://github.com/BitOne/php-meminfo#a-lot-of-memory-usage-is-reported-by-the-memory_usage-entry-but-the-cumulative-size-of-the-items-in-the-summary-is-far-lower-that-the-memory-usage

The 3 bugs have been fixed, and I have released a new v1.0.2 version.

@mathieuk: if you can reproduce the same problem as the one initially mentioned with this new version, can you create a new issue please?

Ideally, having the possibility to reproduce this bug on my side would help a lot.

Thank you!