Open whbogado opened 2 months ago
To further investigate the issue I've tried:
$ occ files:get -vvv /cogeti/files/TESTE-COTED-CT/eLearningSuite_6_1_LS12.7z /tmp/eLearningSuite_6_1_LS12.7z
The 5GB file was downloaded successfully in a few seconds.
Then:
$ occ files:put -vvv /tmp/eLearningSuite_6_1_LS12.7z /cogeti/files/COTED-CT/eLearningSuite_6_1_LS12.7z
failed with:
In S3ObjectTrait.php line 130:
[OCA\DAV\Connector\Sabre\Exception\BadGateway]
Error while uploading to S3 bucket
Exception trace:
at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/S3ObjectTrait.php:130
OC\Files\ObjectStore\S3->writeMultiPart() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/S3ObjectTrait.php:155
OC\Files\ObjectStore\S3->writeObject() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:520
OC\Files\ObjectStore\ObjectStoreStorage->writeStream() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:446
OC\Files\ObjectStore\ObjectStoreStorage->writeBack() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:354
OC\Files\ObjectStore\ObjectStoreStorage->OC\Files\ObjectStore\{closure}() at n/a:n/a
call_user_func() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/apps/files_external/3rdparty/icewind/streams/src/CallbackWrapper.php:117
Icewind\Streams\CallbackWrapper->stream_close() at n/a:n/a
fclose() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/View.php:623
OC\Files\View->file_put_contents() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/Folder.php:167
OC\Files\Node\Folder->newFile() at n/a:n/a
call_user_func_array() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/LazyFolder.php:64
OC\Files\Node\LazyFolder->__call() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/LazyFolder.php:443
OC\Files\Node\LazyFolder->newFile() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/apps/files/lib/Command/Put.php:63
OCA\Files\Command\Put->execute() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Command/Command.php:298
Symfony\Component\Console\Command\Command->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:1040
Symfony\Component\Console\Application->doRunCommand() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:301
Symfony\Component\Console\Application->doRun() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:171
Symfony\Component\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Console/Application.php:183
OC\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/console.php:87
require_once() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/occ:11
In AbstractUploadManager.php line 134:
[Aws\S3\Exception\S3MultipartUploadException]
An exception occurred while uploading parts to a multipart upload. The following parts had errors:
- Part 6: Error executing "UploadPart" on "https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu.br/urn%3Aoid%3A3441346?partNumber=6&uploadId=LkqX4
WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6dQQZrc.qFOzklnvP7RhPW3qFZbc"; AWS HTTP error: cURL
error 28: SSL connection timeout (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu
.br/urn%3Aoid%3A3441346?partNumber=6&uploadId=LkqX4WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6
dQQZrc.qFOzklnvP7RhPW3qFZbc
- Part 7: Error executing "UploadPart" on "https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu.br/urn%3Aoid%3A3441346?partNumber=7&uploadId=LkqX4
WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6dQQZrc.qFOzklnvP7RhPW3qFZbc"; AWS HTTP error: cURL
error 28: SSL connection timeout (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu
.br/urn%3Aoid%3A3441346?partNumber=7&uploadId=LkqX4WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6
dQQZrc.qFOzklnvP7RhPW3qFZbc
- Part 8: Error executing "UploadPart" on "https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu.br/urn%3Aoid%3A3441346?partNumber=8&uploadId=LkqX4
WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6dQQZrc.qFOzklnvP7RhPW3qFZbc"; AWS HTTP error: cURL
error 28: SSL connection timeout (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu
.br/urn%3Aoid%3A3441346?partNumber=8&uploadId=LkqX4WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6
dQQZrc.qFOzklnvP7RhPW3qFZbc
- Part 9: Error executing "UploadPart" on "https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu.br/urn%3Aoid%3A3441346?partNumber=9&uploadId=LkqX4
WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6dQQZrc.qFOzklnvP7RhPW3qFZbc"; AWS HTTP error: cURL
error 28: SSL connection timeout (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu
.br/urn%3Aoid%3A3441346?partNumber=9&uploadId=LkqX4WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6
dQQZrc.qFOzklnvP7RhPW3qFZbc
Exception trace:
at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/aws/aws-sdk-php/src/Multipart/AbstractUploadManager.php:134
Aws\Multipart\AbstractUploadManager->Aws\Multipart\{closure}() at n/a:n/a
Generator->send() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Coroutine.php:137
GuzzleHttp\Promise\Coroutine->_handleSuccess() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:209
GuzzleHttp\Promise\Promise::callHandler() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:158
GuzzleHttp\Promise\Promise::GuzzleHttp\Promise\{closure}() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/TaskQueue.php:52
GuzzleHttp\Promise\TaskQueue->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php:163
GuzzleHttp\Handler\CurlMultiHandler->tick() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php:189
GuzzleHttp\Handler\CurlMultiHandler->execute() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:251
GuzzleHttp\Promise\Promise->invokeWaitFn() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:227
GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:272
GuzzleHttp\Promise\Promise->invokeWaitList() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:229
GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:272
GuzzleHttp\Promise\Promise->invokeWaitList() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:229
GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:69
GuzzleHttp\Promise\Promise->wait() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Coroutine.php:68
GuzzleHttp\Promise\Coroutine->GuzzleHttp\Promise\{closure}() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:251
GuzzleHttp\Promise\Promise->invokeWaitFn() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:227
GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:272
GuzzleHttp\Promise\Promise->invokeWaitList() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:229
GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:69
GuzzleHttp\Promise\Promise->wait() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/aws/aws-sdk-php/src/Multipart/AbstractUploadManager.php:83
Aws\Multipart\AbstractUploadManager->upload() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/S3ObjectTrait.php:122
OC\Files\ObjectStore\S3->writeMultiPart() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/S3ObjectTrait.php:155
OC\Files\ObjectStore\S3->writeObject() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:520
OC\Files\ObjectStore\ObjectStoreStorage->writeStream() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:446
OC\Files\ObjectStore\ObjectStoreStorage->writeBack() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:354
OC\Files\ObjectStore\ObjectStoreStorage->OC\Files\ObjectStore\{closure}() at n/a:n/a
call_user_func() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/apps/files_external/3rdparty/icewind/streams/src/CallbackWrapper.php:117
Icewind\Streams\CallbackWrapper->stream_close() at n/a:n/a
fclose() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/View.php:623
OC\Files\View->file_put_contents() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/Folder.php:167
OC\Files\Node\Folder->newFile() at n/a:n/a
call_user_func_array() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/LazyFolder.php:64
OC\Files\Node\LazyFolder->__call() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/LazyFolder.php:443
OC\Files\Node\LazyFolder->newFile() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/apps/files/lib/Command/Put.php:63
OCA\Files\Command\Put->execute() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Command/Command.php:298
Symfony\Component\Console\Command\Command->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:1040
Symfony\Component\Console\Application->doRunCommand() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:301
Symfony\Component\Console\Application->doRun() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:171
Symfony\Component\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Console/Application.php:183
OC\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/console.php:87
require_once() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/occ:11
files:put <input> <file>
While running the put command there was a 5GB eLearningSuite_6_1_LS12.7z.part file in the destination group folder.
Looking at the code it seems clear to me what the problem is. Instead of retrying the upload, a \OCA\DAV\Connector\Sabre\Exception\BadGateway exception is thrown and the upload is aborted when ANY upload part fails.
This seems very likely to happen for a large file, which is uploaded in many parallel parts. Smaller files are much less likely to hit it. Retrying the command MAY eventually succeed, but it has to start the upload from the beginning, whereas the S3 API allows re-uploading only the failed parts.
protected function writeMultiPart(string $urn, StreamInterface $stream, ?string $mimetype = null): void {
    $uploader = new MultipartUploader($this->getConnection(), $stream, [
        'bucket' => $this->bucket,
        'concurrency' => $this->concurrency,
        'key' => $urn,
        'part_size' => $this->uploadPartSize,
        'params' => [
            'ContentType' => $mimetype,
            'StorageClass' => $this->storageClass,
        ] + $this->getSSECParameters(),
    ]);
    try {
        $uploader->upload();
    } catch (S3MultipartUploadException $e) {
        // if anything goes wrong with multipart, make sure that you don´t poison and
        // slow down s3 bucket with orphaned fragments
        $uploadInfo = $e->getState()->getId();
        if ($e->getState()->isInitiated() && (array_key_exists('UploadId', $uploadInfo))) {
            $this->getConnection()->abortMultipartUpload($uploadInfo);
        }
        throw new \OCA\DAV\Connector\Sabre\Exception\BadGateway('Error while uploading to S3 bucket', 0, $e);
    }
}
AWS documentation shows how to recover from a partial upload failure: https://docs.aws.amazon.com/sdk-for-php/v3/developer-guide/s3-multipart-upload.html#recovering-from-errors
I've tried this and the upload succeeded even if some parts failed to upload at first. Something like this:
do {
    try {
        $result = $uploader->upload();
    } catch (S3MultipartUploadException $e) {
        $uploader = new MultipartUploader($this->getConnection(), $stream, [
            'state' => $e->getState(),
        ]);
    }
} while (!isset($result));
Probably some timeout has to be implemented to avoid an endless loop.
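A bounded version of that loop might look like the sketch below. It has not been tested against Nextcloud. The function name uploadWithResume and the $maxAttempts parameter are made up for illustration and are not part of Nextcloud or the AWS SDK. It also assumes that the source stream can be re-read for the failed parts, and that resuming from the saved state alone is enough (settings such as 'concurrency' or the SSE-C 'params' may need to be passed again).

```php
<?php

use Aws\S3\Exception\S3MultipartUploadException;
use Aws\S3\MultipartUploader;
use Aws\S3\S3ClientInterface;
use Psr\Http\Message\StreamInterface;

/**
 * Hypothetical helper: run a multipart upload and, when some parts fail,
 * resume from the saved upload state instead of starting over.
 * Gives up and aborts the upload after $maxAttempts attempts.
 */
function uploadWithResume(
    S3ClientInterface $client,
    StreamInterface $stream,
    array $options,
    int $maxAttempts = 5
): void {
    $uploader = new MultipartUploader($client, $stream, $options);

    for ($attempt = 1; ; $attempt++) {
        try {
            $uploader->upload();
            return; // all parts uploaded and the upload completed
        } catch (S3MultipartUploadException $e) {
            $state = $e->getState();

            if ($attempt >= $maxAttempts) {
                // Out of attempts: abort so that no orphaned parts
                // are left behind in the bucket, then report the failure.
                $uploadInfo = $state->getId();
                if ($state->isInitiated() && array_key_exists('UploadId', $uploadInfo)) {
                    $client->abortMultipartUpload($uploadInfo);
                }
                throw $e;
            }

            // Resume: the state records which parts already succeeded,
            // so only the missing parts are sent on the next attempt.
            $uploader = new MultipartUploader($client, $stream, ['state' => $state]);
        }
    }
}
```

In writeMultiPart() the call to $uploader->upload() could then be replaced by something like uploadWithResume($this->getConnection(), $stream, $options), with the final S3MultipartUploadException still wrapped in the BadGateway exception as it is today. A fixed attempt count is the simplest bound; a wall-clock timeout would work the same way.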
Bug description
occ files:copy exception while copying large files or folders
Steps to reproduce
Expected behavior
External storage file copied to group folder
Nextcloud Server version
30
Operating system
Debian/Ubuntu
PHP engine version
PHP 8.3
Web server
Apache (supported)
Database engine version
PostgreSQL
Is this bug present after an update or on a fresh install?
Upgraded to a MAJOR version (ex. 28 to 29)
Are you using the Nextcloud Server Encryption module?
Encryption is Disabled
What user-backends are you using?
Configuration report
List of activated Apps
Nextcloud Signing status
Nextcloud Logs
No response
Additional info
I am transferring Nextcloud to another server. The new server is configured to use Amazon S3 as primary storage. The old server has 63 S3 external mounts. Each of those S3-mounted folders is to be copied to a corresponding group folder on the new server. Because S3 primary storage uses a completely different object structure, I cannot simply copy the files to the new server, so I am trying several copy strategies.
The buckets on the old server are in the same account and region as the destination primary bucket. I am trying the occ files:copy command. The copy process is very fast, but unfortunately files:copy aborts the transfer for large files or folders (it seems that it cannot copy more than 5GB).
The old server is running Nextcloud 29.0.5 and the destination is 30.0.0.
Below is an example of what happens when I try to copy a 5GB file (S3 mount) from the old server to the new server (group folder):
While copying I see the eLearningSuite_6_1_LS12.7z.part file inside the group folder. The file disappears after the copy aborts.
I was able to copy a 2.9GB file between the very same folders in a few seconds, but the 5GB file throws the OCP\Files\NotPermittedException exception. The user has full permissions on both folders and an unlimited quota, so it cannot be permission or quota related.
The same happens when I try to copy an entire folder with more than 5GB of files in total, even if there are no very large files in the origin folder.
I have also tried to use the GUI and DAV (via curl). In both cases the copy aborts after a few seconds. DAV aborts with a 504 gateway timeout status. BTW, as expected, DAV copy is much slower than the files:copy command.