phpro / grumphp

A PHP code-quality tool
MIT License

Memory is not being released in parallel execution #1101

Open zoilomora opened 11 months ago

zoilomora commented 11 months ago
Q                  A
-----------------  ------
Version            2.0.0
Bug?               yes
New feature?       no
Question?          yes
Documentation?     no
Related tickets    ~

When executing tasks with parallel enabled (parallel.enabled: true), memory is not being released and usage exceeds the memory_limit configured in PHP.

My configuration

grumphp:
  process_timeout: 120
  ascii:
    failed:
      - config/hooks/ko.txt
    succeeded:
      - config/hooks/ok.txt
  parallel:
    enabled: true
    max_workers: 32
  tasks:
    composer:
      strict: true
    jsonlint: ~
    phpcpd:
      exclude:
        - 'var'
        - 'vendor'
        - 'tests'
      min_lines: 60
    phpcs:
      standard:
        - 'phpcs.xml.dist'
      whitelist_patterns:
        - '/^src\/(.*)/'
        - '/^tests\/(.*)/'
      encoding: 'UTF-8'
    phplint: ~
    phpstan_shell:
      metadata:
        label: phpstan
        task: shell
      scripts:
        - ["-c", "phpstan analyse -l 9 src"]
    phpunit: ~
    behat:
      config: ~
      format: progress
      stop_on_failure: true
    phpversion:
      project: '8.2'
    securitychecker_local:
      lockfile: ./composer.lock
      format: ~

Steps to reproduce: At the end of the vendor/bin/grumphp file add the following to check memory usage:

// Print the memory still allocated at the end of the run, in MB, then stop.
$memory = memory_get_usage() / 1024 / 1024;
print_r(round($memory, 3) . ' MB' . PHP_EOL);
exit();
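
(Optional extension of the snippet above: memory_get_usage() only reports what is still allocated at that point, so it can also help to print the run's high-water mark with memory_get_peak_usage():)

// Current allocation vs. peak allocation during the whole run, in MB.
$current = memory_get_usage() / 1024 / 1024;
$peak = memory_get_peak_usage() / 1024 / 1024;
print_r('Current: ' . round($current, 3) . ' MB' . PHP_EOL);
print_r('Peak: ' . round($peak, 3) . ' MB' . PHP_EOL);
exit();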

Run ./vendor/bin/grumphp run once with each of these options:

Result:

parallel: false
Used Memory: 32.553 MB

parallel: true
Used Memory: 215.642 MB

When the different tasks are finished, shouldn't the memory be released?

Is this the desired behavior?

veewee commented 11 months ago

Running grumphp in parallel mode opens up a separate process for every task you start. There is communication between the parent process and each of those worker processes, and that is probably what is taking up the additional MBs of space. I'm not sure whether that memory needs to be freed manually.

However, grumphp is just a tool that finishes at some point, and at that moment the memory is freed anyway. Therefore I am not sure this really is an issue.

So what do you think? Is this really a problem, or is it rather that you need to increase PHP's memory limit in order to get grumphp running on your project?
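
(For what it's worth, if raising the limit turns out to be the pragmatic fix, it does not require touching the global php.ini: PHP's -d flag can override memory_limit for the grumphp invocation only. The 512M value here is just an example:)

php -d memory_limit=512M ./vendor/bin/grumphp run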

zoilomora commented 11 months ago

My understanding is that if they are separate PHP processes, each process should have its own memory limit.

I try to keep the same memory limits locally as in production.

If the separate processes do not take up more than 32 MB each, it seems strange to me that running all the tasks across the different processes ends up using more than 215 MB.

The current limit is 256 MB, and if I include more files that limit is exceeded. Memory actually runs out when there is only one task left to finish.

I assume the desired behavior would be to release that memory as tasks finish?

ashokadewit commented 1 week ago

I think the memory goes to the serialized task results. The task result contains the context, which contains the file collection. When not running in parallel this object is passed by reference, but when running in parallel it is serialized for each result. If the number of files is large (5000 files in my case) and there are many tasks (20 in my case), GrumPHP will run out of memory. I solved it by registering a middleware that replaces the file collections with an empty object. I'm not sure whether this file collection is used in any way after a task has completed.
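
To make the effect visible in isolation, here is a small standalone sketch. It uses hypothetical FakeContext/FakeResult classes, not GrumPHP's real ones, and the 5000-file / 20-task numbers are just the ones from my case above; it only illustrates how serializing a large file list into every result multiplies memory, and how emptying the collections afterwards releases it:

<?php

// Standalone illustration with made-up classes (not GrumPHP's own).
final class FakeContext
{
    /** @param string[] $files */
    public function __construct(public array $files) {}
}

final class FakeResult
{
    public function __construct(public string $task, public FakeContext $context) {}
}

$files = array_map(fn (int $i): string => "src/File{$i}.php", range(1, 5000));
$context = new FakeContext($files);

$results = [];
foreach (range(1, 20) as $taskNumber) {
    // Simulates a result coming back from a worker process: serialization means
    // every result carries its own private copy of the whole file list.
    $results[] = unserialize(serialize(new FakeResult("task{$taskNumber}", $context)));
}
printf("With 20 serialized results: %.1f MB\n", memory_get_usage() / 1024 / 1024);

// The workaround described above: once a result has been collected, drop its file list.
foreach ($results as $result) {
    $result->context->files = [];
}
printf("After emptying the file lists: %.1f MB\n", memory_get_usage() / 1024 / 1024);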