FriendsOfPHP / pickle

PHP Extension installer

Optimize memory usage when logging output of an executed command. #262

Open givanov2 opened 1 year ago

givanov2 commented 1 year ago

Problem

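During command execution, the output is buffered in the $out array for later logging. The contents of the array are then joined into a single log message.

Holding every line as a separate array element consumes more memory than necessary, because the array and the final joined message exist at the same time.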
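To make the claim above concrete, here is a minimal sketch that compares the memory held by an array of short strings with the memory held by one concatenated string. The helper name `measureBytes` and the line count are assumptions made for illustration; they are not part of pickle. The strings are generated at runtime so that every array element is a separate allocation, as it would be for lines read from a pipe.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns how many bytes of PHP-managed memory
 * are still held by the value that $build returns.
 */
function measureBytes(callable $build): int
{
    $before = memory_get_usage();
    $result = $build();            // keep the result alive while measuring
    $after = memory_get_usage();
    unset($result);

    return $after - $before;
}

$lineCount = 100000; // arbitrary size chosen for the sketch

// One array element per line, as in the buffered approach.
$asArray = measureBytes(function () use ($lineCount): array {
    $out = [];
    for ($i = 0; $i < $lineCount; $i++) {
        $out[] = 'Line ' . $i;
    }
    return $out;
});

// One growing string, as in the proposed approach.
$asString = measureBytes(function () use ($lineCount): string {
    $out = '';
    for ($i = 0; $i < $lineCount; $i++) {
        $out .= 'Line ' . $i . "\n";
    }
    return $out;
});

printf("array:  %d bytes\nstring: %d bytes\n", $asArray, $asString);
```

The exact figures depend on the PHP version and platform, so none are quoted here. The array variant pays a per-element cost (array slot plus string header) for every line, while the string variant pays roughly only for the characters.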

Solution

Do not buffer the command execution output in an array; instead, compose the log message right away, appending each line to a single string as it is read.
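A minimal sketch of the idea, assuming the command output is available as a readable stream. The function name `composeLogMessage` is hypothetical and is not taken from the pickle code base; a `php://memory` stream stands in for the pipe of a real command.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: reads a stream line by line and builds the log
 * message directly, without collecting the lines in an array first.
 *
 * @param resource $stream readable stream, e.g. a process pipe
 */
function composeLogMessage($stream): string
{
    $message = '';
    // Compare against false so that a falsy line such as "0" does not end the loop.
    while (($line = fgets($stream)) !== false) {
        $message .= rtrim($line) . "\n";
    }

    // Drop the trailing newline added after the last line.
    return rtrim($message);
}

// Usage: an in-memory stream stands in for real command output.
$stream = fopen('php://memory', 'w+');
fwrite($stream, "configure: ok\nmake: ok\n");
rewind($stream);
echo composeLogMessage($stream), PHP_EOL;
fclose($stream);
```

One difference from joining an array with implode() is worth keeping in mind: the final rtrim() also removes trailing blank lines, which implode() would preserve.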

mlocati commented 1 year ago

Do you have any benchmark comparing the memory consumption of a big string built by appending tons of chunks vs a big array of strings?

givanov2 commented 1 year ago

Sure. This is a rather synthetic benchmark:

<?php

echo PHP_VERSION, PHP_EOL;

// Create test file in current working directory; file size will be ~ 10 MB.
$resolveFile = function (): string {
    $target = __DIR__ . '/testfile.txt';
    $fh = fopen($target, 'wb+');
    if (!$fh) {
        exit('Failed to create test file.');
    }
    for ($i = 0; $i <= 1000000; $i++) {
        fwrite($fh, "Line line\n");
    }

    fclose($fh);

    return $target;
};

// Output peak memory usage.
$showMemoryUse = function (): void {
    echo sprintf('%.3f', memory_get_peak_usage(true) / 1024 / 1024), PHP_EOL;
};

$oldWay = function () use ($resolveFile, $showMemoryUse): void {
    // Old way: we gather in array first, then compose the log message.
    $pp = fopen($resolveFile(), 'r');
    $out = [];
    // Strict comparison: a falsy line such as a final "0" must not end the loop early.
    while (($line = fgets($pp, 1024)) !== false) {
        $out[] = rtrim($line);
    }
    fclose($pp);

    $log = [];
    $log[] = [
        'level' => 2,
        'msg' => implode("\n", $out),
        'hint' => 'hint',
    ];

    echo md5(serialize($log)), PHP_EOL;

    $showMemoryUse();
};

$newWay = function () use ($resolveFile, $showMemoryUse): void {
    // New way: compose the log entry right away.
    $pp = fopen($resolveFile(), 'r');

    $out = '';
    // Strict comparison: a falsy line such as a final "0" must not end the loop early.
    while (($line = fgets($pp, 1024)) !== false) {
        $out .= rtrim($line) . "\n";
    }
    fclose($pp);

    $log = [];
    $log[] = [
        'level' => 2,
        'msg' => rtrim($out),
        'hint' => 'hint',
    ];

    echo md5(serialize($log)), PHP_EOL;

    unset($out);

    $showMemoryUse();
};

// $newWay should be less memory-intensive.
$newWay();

// $oldWay should be more memory-intensive.
$oldWay();

My output for this very script is:

8.1.5
4c672cb792eb7143ff08dc618f4c71cb
32.617
4c672cb792eb7143ff08dc618f4c71cb
96.805

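Consider commenting out the invocation of either $oldWay or $newWay for a "cleaner" test: memory_get_peak_usage() reports the peak for the whole process, so the second figure also reflects whatever the first run allocated. Note that this script will leave a txt file (testfile.txt) next to it.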
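As an alternative to commenting out one of the calls, the peak can be reset between measurements. This is only a sketch: `peakDuring` is a hypothetical helper, and memory_reset_peak_usage() is available only from PHP 8.2, so on older versions such as the 8.1.5 used above the helper falls back to the process-wide peak and each variant should still be run in its own process.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the peak memory (in bytes) observed
 * while $work runs. Without memory_reset_peak_usage() (PHP < 8.2)
 * the value is the peak of the whole process so far.
 */
function peakDuring(callable $work): int
{
    if (function_exists('memory_reset_peak_usage')) {
        memory_reset_peak_usage();
    }
    $work();

    return memory_get_peak_usage();
}

// Usage: two stand-in workloads of different sizes.
$small = peakDuring(function (): void {
    $buffer = str_repeat('x', 1024 * 1024);
});
$large = peakDuring(function (): void {
    $buffer = str_repeat('x', 8 * 1024 * 1024);
});

printf("small: %.3f MiB\n", $small / 1024 / 1024);
printf("large: %.3f MiB\n", $large / 1024 / 1024);
```

The sketch calls memory_get_peak_usage() without the `true` argument, so it reports memory used by PHP values rather than memory reserved from the system; the benchmark above reports the latter, and the two figures are not directly comparable.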

I hope this is sufficient; I'm sorry I can't come up with a better test that could be incorporated into the project.

Edit: updated the $newWay function to be closer to the actual solution.