aws / aws-sdk-php

Official repository of the AWS SDK for PHP (@awsforphp)
http://aws.amazon.com/sdkforphp
Apache License 2.0
5.99k stars 1.2k forks source link

S3 putObject leaves file handles open #2335

Open desjardinspezmax opened 2 years ago

desjardinspezmax commented 2 years ago

We are putting a file on S3 using putObject. Once the transfer is completed, we are removing the file and try removing the directory. But when we are trying to remove the directory with rmdir, we get an error that the directory is not empty.

We did some searching and we found this bug report having the same behavior than us https://github.com/aws/aws-sdk-php/issues/81

We did the fix suggested by user mtdowling and it fixed our problem.

On the version 3.197.1 of aws-sdk-hp we had no problem. We updated to version 3.198.0 and our code stopped working.

PHP version: PHP 7.4.23 (cli) (built: Sep 3 2021 18:20:46) ( NTS )

        $a_Params = array(
            'Bucket' => CUSTOMER_AWS_BUCKET,
            'Key' => $sFileName,
            'SourceFile' => $sFilePath,
            'StorageClass' => $sStorageClass,
        );

        $objResult = self::$objS3Client->putObject($a_Params);

This is how we patched it to get it working

`
$handle = fopen($sFilePath, 'r');

    $a_Params = array(
        'Bucket' => CUSTOMER_AWS_BUCKET,
        'Key' => $sFileName,
        'Body' => $handle,
        'StorageClass' => $sStorageClass,
    );

    $objResult = self::$objS3Client->putObject($a_Params);

    // Explicitly close the file handle when done
    fclose($handle);`
paulhenri-l commented 2 years ago

Hi there, I ran into this issue as well. But the given workaround only works when the upload is not a multipart one.

When the upload is a multipart one, there are multiple file handles created and they appear to never be closed.

I'd happily make a PR but I am not sure where would be the best place to close these handles.

https://github.com/aws/aws-sdk-php/blob/58fa9d8b522b0afa260299179ff950c783ff0ee1/src/Multipart/AbstractUploader.php#L44:L78

My guess is that they should be closed somewhere after the yield

SamRemis commented 2 years ago

hi there, this is off the top of my head and I would need some more time to look into this to say for sure, but if I recall correctly there's not an ideal place for the file handle to be closed. The file handle still needs to be open when the call to the service is made, so we may have to include a file to be closed in each call and close them individually after a successful response is returned from S3.
If this is right, there's only a short window in the WrappedHttpHandler where we have a request and a response object; we would need to store a reference to the file handler in the request object and close it during that window. The problem there is that we would be adding service specific code in a generalized file, which is something we try to avoid. It requires more investigation for me to figure out the best solution here, if there even is one. It may very well be that we don't actually have the ability to close the file handle since it's being handled actively by guzzle during those calls.
I will look into this more when I have more time

paulhenri-l commented 2 years ago

Yeah the implementation makes it a bit tricky to find the correct place to close these handles. I wish you good luck 😃

In the meantime if anyone faces the same issue here’s a workaround to close the opened file handles

$streams = get_resources('stream');
$uploadedFilePath = '/some/directory/my-file';

foreach ($streams as $stream) {
    $meta = stream_get_meta_data($stream);

    if (str_contains($meta['uri'] ?? '', $uploadedFilePath)) {
        fclose($stream);
    }
}
umesecke commented 2 years ago

We ran into the same problem today. I think the problem lies in the sourceFile middleware (Aws\Middleware::sourceFile). It creates a LazyOpenStream if the command contains a SourceFile but never closes it. I think the middleware should be responsible for closing the stream once the request is finished.

Using putObject with a Body argument works as expected because the caller is responsible for closing the stream.

Kimmax commented 9 months ago

This is still happening like described in #81