phalcon / cphalcon

High performance, full-stack PHP framework delivered as a C extension.
https://phalcon.io
BSD 3-Clause "New" or "Revised" License
10.79k stars 1.97k forks source link

[BUG]: ORM don't free memory in while loop #14656

Open dxkrs opened 4 years ago

dxkrs commented 4 years ago

Run task file in cli mode, I found the memory will not come down!

<?php
namespace Vokuro\App\Tasks;
use Vokuro\App\Models\LiveRoomModel;

Class MyTestORMTask{
    public function testAction(){
        while(1){
            echo PHP_EOL."a1::".memory_get_usage().PHP_EOL;
            $rand = mt_rand(1,20);
            if($rand>=3){
                echo "--a2::".memory_get_usage().PHP_EOL;
                $this->testorm();
                gc_collect_cycles();
                echo "--a3::".memory_get_usage().PHP_EOL;
            }else{
                gc_collect_cycles();
                echo 'sleep '.$rand.'s'.PHP_EOL;
                sleep($rand);
            }
        }
    }

    public function testorm(){
        $objLiveRoom = LiveRoomModel::findFirst([
            'columns' => 'activity_id',
            'conditions' => 'id = :id:',
            'bind' => ['id'=>10]
        ]);
        unset($objLiveRoom);
    }
}

Run result:

a1::1553400
--a2::1553400
--a3::1724456

a1::1724488
sleep 2s

a1::1724488
--a2::1724456
--a3::1724744

a1::1724776
--a2::1724744
--a3::1725032

a1::1725064
--a2::1725032
--a3::1725320

............(pass some result)

a1::1750792
--a2::1750760
--a3::1751048

a1::1751080
--a2::1751048
--a3::1751336

a1::1751368
--a2::1751336
--a3::1751624

............

------Update the problem------ I found out the reason was that I used Model Events when the db was registered in DI.

$di->setShared('db', function () use ($config, $di) {
    $dbconfig = $config->database;

    $eventsManager = new \Phalcon\Events\Manager();
    $profiler = $di->getProfiler();

    $eventsManager->attach('db', function ($event, $connection) use ($profiler, $di) {
        if ($event->getType() == 'beforeQuery') {
            $profiler->startProfile($connection->getSQLStatement());
        }
        if ($event->getType() == 'afterQuery') {
            $profiler->stopProfile();
            $profile = $profiler->getLastProfile();
            $sql = $profile->getSQLStatement();
            $params = $connection->getSqlVariables();
            (is_array($params) && count($params)) && $params = json_encode($params);
            $executeTime = $profile->getTotalElapsedSeconds();

            $logger = $di->get('logger');
            $logger->write_log("{$sql} {$params} {$executeTime}", 'debug');
        }
    });

    $connection = new \Phalcon\Db\Adapter\Pdo\Mysql(array(
            "host" => $dbconfig['host'],
            "port" => $dbconfig['port'],
            "username" => $dbconfig['username'],
            "password" => $dbconfig['password'],
            "dbname" => $dbconfig['dbname'],
            "charset" => $dbconfig['charset'],
            "options" => [
                PDO::ATTR_EMULATE_PREPARES => FALSE
            ])
    );
    $connection->setEventsManager($eventsManager);

    return $connection;
});

If I remove the part of Events Manager, the memory will be freed. However, I found out an other problem. When I use the ORM like this $objLiveRoom = LiveRoomModel::findFirst($oid), the memory problem same. It must be used like $objLiveRoom = LiveRoomModel::findFirstById($oid), or pass in an array,then the problem does not appear. It is a bug too? I don't kown~ ( :

niden commented 4 years ago

Did you issue gc_enable() at the beginning of your script? Also have a look at this:

https://www.php.net/manual/en/features.gc.collecting-cycles.php

It has helped me quite a bit to understand when to call gc_collect_cycles

dxkrs commented 4 years ago

Did you issue gc_enable() at the beginning of your script? Also have a look at this:

https://www.php.net/manual/en/features.gc.collecting-cycles.php

It has helped me quite a bit to understand when to call gc_collect_cycles

the zend.enable_gc in phpinfo is ON,so I think the gc_enable() is unnecessary. I try to add it but the promble same. If not use ORM, you will see the gc is work even no use gc_collect_cycles() and gc_enable(), such as change the function testorm() content:

function testorm(){
    $u = array();
    for($i=0;$i<2333;$i++){    
        $u[] = str_repeat('1',4069);
    }
}

The memory which used in function would be freed after the function finish and the gc work.

dxkrs commented 4 years ago

I have updated the problem at the top content.

niden commented 4 years ago

@dxkrs Thank you this information is very helpful.

It makes sense that the object is not destroyed or cleared since there is a "live" reference to it i.e. the Events Manager.

I will check the findFirst/findFirstBy with some tests to see if I can figure out the issue.

niden commented 4 years ago

@dxkrs This is really puzzling. findFirstById performs some checks and then calls find on the model.

The check:

https://github.com/phalcon/cphalcon/blob/master/phalcon/Mvc/Model.zep#L4326

and the call is

https://github.com/phalcon/cphalcon/blob/master/phalcon/Mvc/Model.zep#L4430

What we get back from both methods is the same object. Since the resulting object is a model, it is bound to the DI container and it will have references in it to external services (as you saw with the events manager in your test).

I have not been able to find any reference that if removed will reduce the memory or anything like that.

niden commented 4 years ago

Just a thought, you could try ->toArray() when getting the record and use the array only. Can you run this and have a look if memory is reduced/freed properly? If not then the issue is somewhere else and not in the model.

dxkrs commented 4 years ago

We need to test the findFirst($oid) ,not findFirstById($oid), such as $objLiveRoom = LiveRoomModel::findFirst($oid). The promble only be found when I use findFirst($oid)! The other format is OK, such as

 $objLiveRoom = LiveRoomModel::findFirst([
    'conditions' => 'id = :id:',
    'bind' => ['id'=>$oid]
])
dxkrs commented 4 years ago

By the way, we shouldn't register the Model Events when we test the findFirst($oid) because of the Model Events must make the memory promble. So, the other promble is can the Events Manager release memory automatically by itself?

niden commented 4 years ago

The memory will be released when the process ends i.e. the script. If there is a models manager there, it will stay there until the end of the script.

From what you wrote, I checked the code again and from what I see there is a reference to the modelsMetadata - the service that helps the ORM understand what fields we have in a table and what they are. The getPhql method of the Query Builder (which is used internally) accesses that service from the container, so the reference could come from there. The relevant line is here:

https://github.com/phalcon/cphalcon/blob/master/phalcon/Mvc/Model/Query/Builder.zep#L631

Honestly I have no idea regarding the PHP internals and how references are kept etc. That is the only spot that I can see, that when you pass a numeric value, an additional service is accessed.

niden commented 4 years ago

Also I took your code and tried the test on my own. Have a look at this test

https://github.com/niden/cphalcon/blob/T14656-orm-memory/tests/integration/Mvc/XCest.php

and the results do not show memory waste:

https://github.com/niden/cphalcon/blob/T14656-orm-memory/findNumber.txt

(ignore the "addBehavior" message - that was a copy paste error)

runphp commented 4 years ago

I test found leak memory when unused bind params

while (true) {
    // unbind code
    $count = EcmStore::count('add_time > '. (time()- random_int(86400, 100*86400)));
    echo sprintf("%s %s\n", memory_get_usage(), $count);

    /*
     // bind code
    $count = EcmStore::count([
        'add_time > {time:int}',
        'bind' => [
            'time' => time() - random_int(86400, 100 * 86400),
        ],
    ]);
     echo sprintf("%s %s\n", memory_get_usage(), $count);
    */

    /*
     // unbind code
    $resultSet = $container->getShared('modelsManager')->createBuilder()->from(EcmStore::class)
        ->columns('count(*) as count')
        ->where('add_time > ' . (time() - random_int(86400, 100 * 86400)))
        ->getQuery()->execute();
    $count = $resultSet->getFirst()->count;
    echo sprintf("%s %s\n", memory_get_usage(), $count);
    */

    /*
     // bind code
    $resultSet = $container->getShared('modelsManager')->createBuilder()->from(EcmStore::class)
        ->columns('count(*) as count')
        ->where('add_time > {time:int}',['time' => time()- random_int(86400, 100*86400)])
        ->getQuery()->execute();
    $count = $resultSet->getFirst()->count;
      echo sprintf("%s %s\n", memory_get_usage(), $count);
    */
}

when unbind code output

1224528 5
1235200 9
1245840 5
1256480 3
1267120 11
1278080 10
1288720 10
1299360 14
1310320 14
1320960 14
1331600 11
1342240 14
1352880 10
1364160 6
1374800 11
1385440 14
1396720 14
1407360 11
1418000 11

when bind code output

1215176 10
1214800 11
1214800 10
1214800 11
1214800 10
1214800 10
1214800 10
1214800 9
1214800 11
1214800 9
1214800 10
1214800 11
1214800 11
1214800 10
1214800 14
1214800 11
Jurigag commented 4 years ago

About last code - yea that's true. Phalcon internally parses PHQL and stores assosciated SQL query for faster queries later. At least it was like this on phalcon 3.x and phalcon 2.x as far as i know. This is why on unbind you have higher memory consumption on each call.

But im not sure about the first code you posted - it shouldn't really affect memory anyhow. Can you try to add there Phalcon\Mvc\Model\Query::clean() after each find and check the results? Also you could check as well content of Phalcon\Mvc\Model\Query::_irPhqlCache. Maybe for some reason it doesn't work as expected and stores PHQL each time, even no changes.

All this logic for caching happens here - https://github.com/phalcon/cphalcon/blob/v4.0.2/phalcon/Mvc/Model/Query.zep#L2754

sergeyklay commented 4 years ago

Refs: