Closed: zoilomora closed this issue 2 months ago
Running grumphp in parallel mode opens up a separate process for every task you start. The main process and each worker process communicate with each other, and that is probably what takes up the additional MBs of memory. I'm not sure whether that memory needs to be freed manually.
However, grumphp is just a tool that finishes at some point. At that moment, the memory gets freed nevertheless. Therefore I am not sure if this really is an issue.
So what do you think about this? Is this really a problem or is the problem rather that you need to increase PHP's memory limit in order to get grumphp running on your project?
My understanding is that if they are separate PHP processes, the memory limit should apply to each process individually.
I try to use the same memory limits locally as in production.
If the separate processes do not take up more than 32MB, it seems strange to me that all the tasks in the different processes take up more than 215MB.
The current limit is 256 MB and if I include more files the memory is exceeded. However, memory runs out when there is only 1 task left to finish.
I would expect the desired behavior to be that this memory is released as tasks finish. Is that right?
I think the memory goes to the serialized task results. The task result contains the context, which contains the file collection. When not running in parallel, this object is passed by reference, but when running in parallel it is serialized for each result. If the number of files is large (5000 files in my case) and there are many tasks (20 in my case), GrumPHP will run out of memory. I solved it by registering a middleware to replace the file collections with an empty object. I'm not sure if this file collection is used in any way after a task has completed.
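To illustrate the effect described above, here is a minimal sketch in plain PHP. It does not use GrumPHP itself: `DemoContext` and `DemoResult` are hypothetical stand-ins, and the serialize/unserialize round trip only simulates a result coming back from a worker process. The file and task counts are taken from the report above.

```php
<?php
// Hypothetical stand-ins for illustration; these are NOT GrumPHP classes.
final class DemoContext
{
    /** @var string[] */
    public array $files;

    public function __construct(array $files)
    {
        $this->files = $files;
    }
}

final class DemoResult
{
    public DemoContext $context;

    public function __construct(DemoContext $context)
    {
        $this->context = $context;
    }
}

// 5000 file paths, as in the report above.
$files = [];
for ($i = 0; $i < 5000; $i++) {
    $files[] = sprintf('src/Module%04d/File%04d.php', $i % 50, $i);
}
$context = new DemoContext($files);

// Sequential style: every result points at the same context object.
$shared = [];
for ($t = 0; $t < 20; $t++) {
    $shared[] = new DemoResult($context);
}

// Parallel style (simulated): each result goes through a
// serialize/unserialize round trip and comes back with its own copy
// of the context and of every file path in it.
$before = memory_get_usage();
$copied = [];
for ($t = 0; $t < 20; $t++) {
    $copied[] = unserialize(serialize(new DemoResult($context)));
}
$extraBytes = memory_get_usage() - $before;

// Identity checks: shared results reference one object, copies do not.
$sharedIsSame = $shared[0]->context === $shared[19]->context;
$copiedIsSame = $copied[0]->context === $copied[19]->context;

printf("Extra memory held by 20 copied results: %.1f MB\n", $extraBytes / 1048576);
```

The absolute number printed depends on the PHP version and path lengths; the point is only that the copies add up per task while the shared references do not.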
> I solved it by registering a middleware to replace the file collections with an empty object.
Can you share your solution?
> I'm not sure if this file collection is used in any way after a task has completed.
Currently not in this repository. However, it's an official extension point, so someone might be relying on it as a feature.
What I'm wondering is: Once the task has been executed in a separate worker, the serialized version is not being used anymore, meaning that it should be garbage collected at that point. So I assume the problem is that the context in the result is the serialized worker context instead of the initial process' context. So it might make sense to swap it back to the original reference, after which garbage collection kicks in?
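A rough sketch of that idea in plain PHP, without GrumPHP: `SketchContext` and `SketchResult` are hypothetical stand-ins, and the serialize/unserialize round trip only simulates results returning from workers. It assumes nothing else holds a reference to the copied contexts, in which case PHP can free each copy as soon as the result is pointed back at the original.

```php
<?php
// Hypothetical stand-ins for illustration; these are NOT GrumPHP classes.
final class SketchContext
{
    /** @var string[] */
    public array $files;

    public function __construct(array $files)
    {
        $this->files = $files;
    }
}

final class SketchResult
{
    public SketchContext $context;

    public function __construct(SketchContext $context)
    {
        $this->context = $context;
    }
}

// The context owned by the initial process.
$original = new SketchContext(array_map(
    static fn (int $i): string => "src/File{$i}.php",
    range(1, 5000)
));

// Simulate 20 results returning from workers, each with its own context copy.
$results = [];
for ($t = 0; $t < 20; $t++) {
    $results[] = unserialize(serialize(new SketchResult($original)));
}
$withCopies = memory_get_usage();

// Swap every result back to the original context. Each copy loses its
// last reference here, so its memory can be reclaimed.
foreach ($results as $result) {
    $result->context = $original;
}
$afterSwap = memory_get_usage();

printf("Freed by swapping: %.1f MB\n", ($withCopies - $afterSwap) / 1048576);
```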
> Can you share your solution?
Sure:
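namespace My;

use Amp\Promise;
use GrumPHP\Collection\FilesCollection;
use GrumPHP\Runner\TaskHandler\Middleware\TaskHandlerMiddlewareInterface;
use GrumPHP\Runner\TaskResult;
use GrumPHP\Runner\TaskRunnerContext;
use GrumPHP\Task\Context\RunContext;
use GrumPHP\Task\TaskInterface;
use ReflectionProperty;

class UnsetFilesMiddleware implements TaskHandlerMiddlewareInterface
{
    /**
     * Unset files from task results.
     *
     * @param TaskInterface $task
     * @param TaskRunnerContext $runnerContext
     * @param callable $next
     *
     * @return Promise
     */
    public function handle(TaskInterface $task, TaskRunnerContext $runnerContext, callable $next): Promise
    {
        $result = $next($task, $runnerContext);
        if ($result instanceof Promise) {
            $result->onResolve(
                function ($exception, $value): void {
                    if ($value instanceof TaskResult) {
                        // The context property is private, so overwrite it via
                        // reflection with a context holding an empty file collection.
                        $property = new ReflectionProperty($value, 'context');
                        $property->setAccessible(true);
                        $property->setValue($value, new RunContext(new FilesCollection([])));
                    }
                }
            );
        }
        return $result;
    }
}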
And then registered in grumphp.yml with:
My\UnsetFilesMiddleware:
    tags:
        - name: grumphp.task_handler
          priority: 500
I'm still using version 1.5.1 of GrumPHP, so I'm not sure if this is also compatible with the newest version.
Can you verify swapping the "serialized" context coming back from the worker with the original context also does the trick?
  $property = new ReflectionProperty($value, 'context');
  $property->setAccessible(true);
- $property->setValue($value, new RunContext(new FilesCollection([])));
+ $property->setValue($value, $runnerContext);
> I'm still using version 1.5.1 of GrumPHP, so I'm not sure if this is also compatible with the newest version.
In 2.0 the async execution system changed but it is still using the context coming back from the worker. So I assume it will have similar issues.
> Can you verify swapping the "serialized" context coming back from the worker with the original context also does the trick?
It does :)
@ashokadewit Can you confirm the fix in #1147 would resolve the issue?
Version: 2.0.0
When executing tasks with parallel enabled: true, the memory is not released and usage exceeds the limit set in PHP.
My configuration
Steps to reproduce: At the end of the vendor/bin/grumphp file, add the following to check memory usage:
Run ./vendor/bin/grumphp run once with each of these options:
parallel: false
parallel: true
Result:
When the different tasks are finished, shouldn't the memory be released?
Is this the desired behavior?
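The memory check used in the reproduction steps is not included in this copy of the issue. As a stand-in, a minimal sketch of such a check could look like the following. It is an assumption, not the reporter's code: `formatMegabytes` is a hypothetical helper, and a shutdown function is used so the line is printed even if the application ends with exit().

```php
<?php
// Hypothetical helper: formats a byte count as megabytes.
function formatMegabytes(int $bytes): string
{
    return sprintf('%.2f MB', $bytes / 1024 / 1024);
}

// The callback runs when the script ends, including after exit().
// memory_get_peak_usage(true) reports the peak memory allocated from
// the system by this PHP process.
register_shutdown_function(static function (): void {
    fwrite(STDERR, 'Peak memory: ' . formatMegabytes(memory_get_peak_usage(true)) . PHP_EOL);
});
```

This only measures the process it runs in, so with parallel enabled it would report the main process and not the workers.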