Open flaushi opened 3 years ago
You can use the memory profiler to find where this memory is: https://github.com/arnaud-lb/php-memory-profiler
Wow, I didn't know about this tool, great! However, this is the situation:
the query being executed is this:
return $this->_em->createQuery(
'SELECT s from App\Entity\DataCategory s
WHERE s.deletedAt IS NULL
AND MY_JSON_CONTAINS(s.tags, :tags) = true
ORDER BY s.name'
)
->setParameter('tags', json_encode($tags) )
->getResult();
This should just query the entities and add them to the UnitOfWork, which I clear regularly. How is it possible that memory is leaked then?
Edit: This is confusing. My code actually fetches many more entities, like this:
$inputItem->dc = $this->em->find(DataCategory::class, $inputItem->dc); // not reported or visible in memprof
if ($inputItem->dc instanceof TagDataCategory)
$children = $this->em->getRepository(DataCategory::class)
->getCategoriesWithTags($inputItem->dc->selectedTags); // <--- these are reported by memprof
else
$children = $inputItem->dc->getChildren(); // these are direct ManyToOne associations
Am I guessing correctly that memprof only reports allocations that have not been freed, so that the DQL query is the one which leaks??
Thank you so much for your help!
From the description in the README (emphasis mine):
The extension tracks the allocation and release of memory blocks to report the amount of memory leaked by every function, method, or file in a program.
- Reports non-freed memory at arbitrary points in the program
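For reference, a minimal sketch of how a dump could be taken at one of those "arbitrary points". The helper name `dumpMemprofProfile()` and the output path are made up for this example; `memprof_enabled()` and `memprof_dump_callgrind()` are the function names as I understand them from the memprof README, so please check them against the version you have installed. Profiling also has to be switched on when the script starts (recent versions use the `MEMPROF_PROFILE=1` environment variable, as far as I know).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: writes a callgrind-format dump of the memory that is
 * still allocated at the moment of the call. Returns false when the memprof
 * extension is missing or profiling is not enabled.
 */
function dumpMemprofProfile(string $path): bool
{
    // Guard so the script still runs without the extension.
    if (!function_exists('memprof_enabled') || !memprof_enabled()) {
        return false;
    }

    $stream = fopen($path, 'w');
    if ($stream === false) {
        return false;
    }

    // Only blocks that have not been freed yet end up in the dump.
    memprof_dump_callgrind($stream);
    fclose($stream);

    return true;
}
```

Calling this right after `$em->clear()` should show what is still held at that point, rather than what was allocated in total.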
So, I am speechless. Does this mean that the repository method leaks?
I thought that when I load an entity through the entity manager, it is inserted into the UnitOfWork, which is cleared properly by $em->clear().
I changed the repository method to first load only the ids of suitable entities and then find them:
class DataCategoryRepository extends EntityRepository
{
public function getCategoriesWithTags(array $tags, $prefetchMode = false) : array
{
return array_map(
fn ($id) => $this->_em->find(DataCategory::class, $id),
$this->getCategoryIdsWithTags($tags));
//$where = 'SELECT s from App\Entity\DataCategory s WHERE s.deletedAt IS NULL AND MY_JSON_CONTAINS(s.tags, :tags) = true ORDER BY s.name';
//return $this->_em->createQuery($where)
// ->setParameter('tags', json_encode($tags) )
// ->getResult();
}
public function getCategoryIdsWithTags(array $tags) : array
{
$where = 'SELECT s.id from App\Entity\DataCategory s WHERE s.deletedAt IS NULL AND MY_JSON_CONTAINS(s.tags, :tags) = true ORDER BY s.name';
return array_column(
$this->_em->createQuery($where)
->setParameter('tags', json_encode($tags) )
->getScalarResult(),
'id');
}
}
again here a new screenshot
so this looks as if the repository method has a leak? Where?
Or could the rest of my code be leaking?
for the sake of completeness:
class JsonContainsCustomDQLFunction extends FunctionNode
{
/** @var Node */
private $second;
/** @var Node */
private $first;
public function getSql(SqlWalker $sqlWalker)
{
$first = $this->first->dispatch($sqlWalker);
$second = $this->second->dispatch($sqlWalker);
if ($sqlWalker->getConnection()->getDatabasePlatform() instanceof PostgreSqlPlatform) {
return "$first @> $second";
} else if ($sqlWalker->getConnection()->getDatabasePlatform() instanceof MySqlPlatform) {
return "JSON_CONTAINS($first, $second)";
} else
throw new QueryException('Platform for JSON_CONTAINS not supported.');
}
public function parse(Parser $parser)
{
$parser->match(Lexer::T_IDENTIFIER);
$parser->match(Lexer::T_OPEN_PARENTHESIS);
$this->first = $parser->StringPrimary();
$parser->match(Lexer::T_COMMA);
$this->second = $parser->StringPrimary();
$parser->match(Lexer::T_CLOSE_PARENTHESIS);
}
}
When you call the repository, since clear isn't called inside it, more memory is used than before, presumably because of the entity map. Although this fits the definition of a leak, it is intended, but memprof doesn't know about this.
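As a rough illustration of why a profiler would flag this, here is a toy sketch in plain PHP. `ToyIdentityMap` and `retainedBytes()` are made up for this example and are not Doctrine's UnitOfWork; the only point is that whatever a map still references counts as "not freed" until the map is cleared.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for an entity map; NOT Doctrine code.
final class ToyIdentityMap
{
    /** @var array<int, object> */
    private array $entities = [];

    public function find(int $id): object
    {
        // Reuse the tracked instance, or create one and keep a reference to it.
        return $this->entities[$id] ??= (object) [
            'id' => $id,
            'payload' => str_repeat('x', 1024), // roughly 1 KiB per "entity"
        ];
    }

    public function clear(): void
    {
        // Dropping the references lets PHP free the objects.
        $this->entities = [];
    }

    public function count(): int
    {
        return count($this->entities);
    }
}

// Bytes still allocated after $work has returned.
function retainedBytes(callable $work): int
{
    gc_collect_cycles();
    $before = memory_get_usage();
    $work();
    gc_collect_cycles();

    return memory_get_usage() - $before;
}

$map = new ToyIdentityMap();

// The caller keeps no reference to the results, yet the memory stays allocated.
$retained = retainedBytes(function () use ($map): void {
    for ($id = 1; $id <= 1000; $id++) {
        $map->find($id);
    }
});
$tracked = $map->count();

$map->clear();
$trackedAfterClear = $map->count();

printf("tracked=%d, retained=%d bytes, tracked after clear=%d\n", $tracked, $retained, $trackedAfterClear);
```

A profiler that attributes non-freed blocks to the function that allocated them would point at `find()` here, even though nothing is wrong with it.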
Maybe you could try using https://github.com/BitOne/php-meminfo instead?
I think that instead of showing you what method "leaked" memory, it will show you what objects are taking up so much memory. There is even a guide on hunting down memory leaks:
https://github.com/BitOne/php-meminfo/blob/master/doc/hunting_down_memory_leaks.md
Hope this helps, I haven't had to do this myself before.
Thanks for this direction, I will follow it tomorrow.
Anyway, the fact that I am calling $em->find(DataCategory::class, $id) over and over without seeing it in memprof, while my DQL query with getResult() is shown, makes me wonder.
To conclude this support case:
A) You are not aware of any memleak in queries and getResult() calls, right?
B) And it should be possible to "travel" the association graph of entities over millions of jumps without leaking memory, too (of course with intermediate $em->clear() calls)?
C) Both $em->find(fqcn, $id) and $em->createQuery()->getResult() are supposed to return entities that are stored automatically in the entityMap before being returned to me?
Is there an option to get hydrated but unmanaged entities from the entity manager? (I guess no)
I guess it is the same problem described here: https://stackoverflow.com/questions/26616861/memory-leak-when-executing-doctrine-query-in-loop
I am running $em->clear() periodically, and still have the memory leak issue.
So in the Symfony config/packages/doctrine.yaml I have this option:
doctrine:
    dbal:
        default_connection: main
        connections:
            main:
                logging: false
With logging: false, Doctrine does not log queries to the log file, but I guess Doctrine is still keeping the queries somewhere in memory, which is why I am having the memory leak issue.
So the solution is one of the following:
- run the command with the --no-debug option
- call $em->getConnection()->getConfiguration()->setSQLLogger(null);
- set profiling: false:
doctrine:
    dbal:
        default_connection: main
        connections:
            main:
                profiling: false
Otherwise, the SQL logger Doctrine\DBAL\Logging\DebugStack keeps all the queries.
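Putting the logger fix together with the periodic clear, a long-running loop could look roughly like the sketch below. `processInBatches()`, its parameters and the batch size are hypothetical names for this example. `setSQLLogger()` is the DBAL 2.x/3.x configuration method (I am assuming one of those versions, since DebugStack is in play), and the function needs a real Doctrine entity manager to run.

```php
<?php
declare(strict_types=1);

use Doctrine\ORM\EntityManagerInterface;

/**
 * Hypothetical helper: loads entities one by one from a list of identifiers
 * and clears the entity manager every $batchSize items.
 *
 * @param iterable<int|string> $ids    identifiers only, never entity objects
 * @param callable             $handle called with each loaded entity
 */
function processInBatches(
    EntityManagerInterface $em,
    string $class,
    iterable $ids,
    callable $handle,
    int $batchSize = 100
): int {
    // Stop the in-memory SQL logger from collecting every executed query
    // (assumption: DBAL 2.x or 3.x, where this method exists).
    $em->getConnection()->getConfiguration()->setSQLLogger(null);

    $processed = 0;
    foreach ($ids as $id) {
        $entity = $em->find($class, $id);
        if ($entity !== null) {
            $handle($entity);
        }
        unset($entity); // keep no reference across iterations

        if (++$processed % $batchSize === 0) {
            $em->clear(); // detach everything loaded so far
        }
    }
    $em->clear();

    return $processed;
}
```

The handler must not store the entities anywhere either, otherwise clearing the entity manager cannot free them.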
Bug Report
Summary
I think there is a memory leak in long running processes.
Current behavior
My memory consumption grows all the time, although I keep no reference to visited nodes and clear the em regularly. I am traversing an object graph using iteration (not recursion). My stack only has the identifiers, not the entities.
How to reproduce
Please see my example here: https://stackoverflow.com/questions/68686479/leaking-memory-while-traversing-an-object-graph/68686896#68686896
Expected behavior
I'd expect to get along with no more than a few megabytes of memory consumption all the time.