googleapis / google-cloud-php

Google Cloud Client Library for PHP
https://cloud.google.com/php/docs/reference
Apache License 2.0
1.09k stars 436 forks source link

[Storage API] Possible memory leak on json_decode #3956

Closed Vheclis closed 1 year ago

Vheclis commented 3 years ago

Environment details

Hi guys, we have a consumer that consumes from a Kafka topic that is getting increasing amounts of memory on our docker container, after each message. I ran a memory profiling using this extension (https://github.com/arnaud-lb/php-memory-profiler) for PHP, and checked the data using pprof (https://github.com/gperftools/gperftools). I gathered info on the first message consumed and after ten messages, then compared both profiling to see were memory has increased, and a json_decode call was the one who was leaking the most memory, as I'm showing on the image below.

image

After that, i took the time to go to that part of code src/RequestBuilder.php on his constructor (you can see that is there where the main memory is leaking if u see the top square, with name GoogleCloudCoreRequestBuilder__construct) and change so that the use of json_decode got unset/set to null after i used it, just to see if my consumed memory was lower.

For my surprise, it worked. I ran the profiling again, and see how that changed

image

There is memory leak, but so much less and from another source. Before making sure json_decode response objects were being freed, i was getting around 8mb per message on RAM. With the change, its close to 0.5mb maximum.

I tried recreate StorageClient on each message consume handle, but even with that there was no effect on the leak. Do you guys agree that this could be a problem on the API or there is something i could do on my side to fix this?

Below I'm sending the src/RequestBuilder.php altered just for reference, so you guys can see what i changed so that json_decode was freed. I don't think that we should do something like that to fix, was just a test to see if the problem was really there.

<?php

namespace Google\Cloud\Core;

use GuzzleHttp\Psr7;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Psr7\Uri;
use Psr\Http\Message\RequestInterface;

/**
 * Builds a PSR7 request from a service definition.
 */
class RequestBuilder
{
    use JsonTrait;
    use UriTrait;

    /**
     * @var string
     */
    private $servicePath;

    /**
     * @var string
     */
    private $baseUri;

    /**
     * @var array
     */
    private $resourceRoot;

    /**
     * @param string $servicePath
     * @param string $baseUri
     * @param array  $resourceRoot [optional]
     */
    public function __construct($servicePath, $baseUri, array $resourceRoot = [])
    {
        $service = $this->loadServiceDefinition($servicePath);
        $this->servicePath = $servicePath;
        $this->resourceRoot = $resourceRoot;

        // Append service definition base path if bare apiEndpoint domain is given.
        if (isset($service['basePath'])) {
            $uriParts = parse_url($baseUri) + ['path' => null];
            if (!$uriParts['path'] || $uriParts['path'] === '/') {
                $uriParts['path'] = $service['basePath'];

                // Recreate the URI from its modified parts and ensure it ends in a single slash.
                $this->baseUri = rtrim((string) Uri::fromParts($uriParts), '/') . '/';

                return;
            }
        }

        $service= null;

        $this->baseUri = rtrim($baseUri, '/') . '/';
    }

    /**
     * Build the request.
     *
     * @param string $resource
     * @param string $method
     * @param array $options [optional]
     * @return RequestInterface
     * @todo complexity high, revisit
     * @todo consider validating against the schemas
     */
    public function build($resource, $method, array $options = [])
    {
        $root = $this->resourceRoot;

        array_push($root, 'resources');
        $root = array_merge($root, explode('.', $resource));
        array_push($root, 'methods', $method);

        $service = $this->loadServiceDefinition($this->servicePath);
        $action = $service;
        foreach ($root as $rootItem) {
            if (!isset($action[$rootItem])) {
                throw new \InvalidArgumentException('Provided path item ' . $rootItem . ' does not exist.');
            }
            $action = $action[$rootItem];
        }

        $path = [];
        $query = [];
        $body = [];

        if (isset($action['parameters'])) {
            foreach ($action['parameters'] as $parameter => $parameterOptions) {
                if ($parameterOptions['location'] === 'path' && array_key_exists($parameter, $options)) {
                    $path[$parameter] = $options[$parameter];
                    unset($options[$parameter]);
                }

                if ($parameterOptions['location'] === 'query' && array_key_exists($parameter, $options)) {
                    $query[$parameter] = $options[$parameter];
                }
            }
        }

        if (isset($service['parameters'])) {
            foreach ($service['parameters'] as $parameter => $parameterOptions) {
                if ($parameterOptions['location'] === 'query' && array_key_exists($parameter, $options)) {
                    $query[$parameter] = $options[$parameter];
                }
            }
        }

        if (isset($action['request'])) {
            $schema = $action['request']['$ref'];

            foreach ($service['schemas'][$schema]['properties'] as $property => $propertyOptions) {
                if (array_key_exists($property, $options)) {
                    $body[$property] = $options[$property];
                }
            }
        }

        $uri = $this->buildUriWithQuery(
            $this->expandUri($this->baseUri . $action['path'], $path),
            $query
        );

        $httpMethod = $action['httpMethod'];
        $action = null;
        $service = null;

        return new Request(
            $httpMethod,
            $uri,
            ['Content-Type' => 'application/json'],
            $body ? $this->jsonEncode($body) : null
        );
    }

    /**
     * @param string $servicePath
     * @return array
     */
    private function loadServiceDefinition($servicePath)
    {
        return $this->jsonDecode(
            file_get_contents($servicePath, true),
            true
        );
    }
}
vishwarajanand commented 1 year ago

@Vheclis I see that this has been raised as a bug for PHP as well. However, I just cannot repro it in my local:

Code to get memory usage for JSON_DECODE ``` php src/a.php namespace Google\Cloud\Test; require_once __DIR__ . './../vendor/autoload.php'; use Google\Cloud\Core\JsonTrait; class Test { use JsonTrait; private $service; public function runTests($testCases) { echo "| JSON_DECODE Count | Memory usage (in MB) | Peak memory usage (in MB)".PHP_EOL; echo "|-------------|---------------------|--------------------|".PHP_EOL; for ($i = 1; $i <= $testCases; $i++) { $this->testMethod(); // shorten output if ($i % ($testCases/10) === 0) { $this->printMem($i); } } echo "JSON_DECODE(..) executed $i times successfully.".PHP_EOL; } private function testMethod() { $servicePath = __DIR__ . '/../../Storage/src/Connection/ServiceDefinition/storage-v1.json'; $this->service = $this->jsonDecode( file_get_contents($servicePath, true), true ); } private function printMem($test_count) { /* Currently used memory */ $mem_usage = memory_get_usage(); /* Peak memory usage */ $mem_peak = memory_get_peak_usage(); echo "| $test_count |". round($mem_usage / (1024*1024))." | ". round($mem_peak / (1024*1024))." |". PHP_EOL; } } (new Test())->runTests(10000); ```
JSON_DECODE Count Memory usage (in MB) Peak memory usage (in MB)
1000 2 3
2000 2 3
3000 2 3
4000 2 3
5000 2 3
6000 2 3
7000 2 3
8000 2 3
9000 2 3
10000 2 3

My environment details:

google-cloud-php/Core % php -v                                                                                                             (git)-[main]-
PHP 8.0.28 (cli) (built: Apr 26 2023 19:05:33) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.28, Copyright (c) Zend Technologies
    with Xdebug v3.1.5, Copyright (c) 2002-2022, by Derick Rethans
google-cloud-php/Core %
vishwarajanand commented 1 year ago

Closing because unable to repro