nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0

[Bug]: occ files:copy fails with large files #48194

Open whbogado opened 2 months ago

whbogado commented 2 months ago

Bug description

occ files:copy throws an exception when copying large files or folders

Steps to reproduce

  1. Mount an external S3 storage
  2. Create a group folder
  3. Run occ files:copy to copy a large file or folder from the external storage to the group folder

Expected behavior

External storage file copied to group folder

Nextcloud Server version

30

Operating system

Debian/Ubuntu

PHP engine version

PHP 8.3

Web server

Apache (supported)

Database engine version

PostgreSQL

Is this bug present after an update or on a fresh install?

Upgraded to a MAJOR version (ex. 28 to 29)

Are you using the Nextcloud Server Encryption module?

Encryption is Disabled

What user-backends are you using?

Configuration report

{
    "system": {
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "maintenance": false,
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "maintenance_window_start": 4,
        "default_phone_region": "BR",
        "trusted_domains": [
            "localhost",
            "drive.ct.utfpr.edu.br"
        ],
        "trusted_proxies": "***REMOVED SENSITIVE VALUE***",
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "skeletondirectory": "\/var\/lib\/nextcloud\/skeleton",
        "templatedirectory": "\/var\/lib\/nextcloud\/templates",
        "dbtype": "pgsql",
        "version": "30.0.0.14",
        "overwrite.cli.url": "https:\/\/drive.ct.utfpr.edu.br\/",
        "htaccess.RewriteBase": "\/",
        "overwriteprotocol": "https",
        "overwritehost": "drive.ct.utfpr.edu.br",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "logfile": "\/var\/lib\/nextcloud\/data\/nextcloud.log",
        "objectstore": {
            "class": "\\OC\\Files\\ObjectStore\\S3",
            "arguments": {
                "bucket": "drive.ct.utfpr.edu.br",
                "key": "***REMOVED SENSITIVE VALUE***",
                "secret": "***REMOVED SENSITIVE VALUE***",
                "use_ssl": true,
                "region": "sa-east-1",
                "use_path_style": false
            }
        },
        "theme": "",
        "loglevel": 2,
        "mail_smtpmode": "smtp",
        "mail_smtpauthtype": "PLAIN",
        "mail_sendmailmode": "smtp",
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpauth": true,
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpport": "587",
        "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
        "mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
        "filelocking.enabled": true,
        "memcache.local": "\\OC\\Memcache\\APCu",
        "memcache.distributed": "\\OC\\Memcache\\Redis",
        "memcache.locking": "\\OC\\Memcache\\Redis",
        "redis": {
            "host": "***REMOVED SENSITIVE VALUE***",
            "port": 6379,
            "timeout": 0,
            "password": "***REMOVED SENSITIVE VALUE***",
            "dbindex": 0
        },
        "memcached_servers": [
            [
                "cache.cogeti.ct.internal",
                11211
            ]
        ],
        "diagnostics.logging": true,
        "diagnostics.logging.threshold": 0
    }
}

List of activated Apps

Enabled:
  - activity: 3.0.0
  - admin_audit: 1.20.0
  - bruteforcesettings: 3.0.0
  - circles: 30.0.0-dev
  - cloud_federation_api: 1.13.0
  - comments: 1.20.1
  - contactsinteraction: 1.11.0
  - dashboard: 7.10.0
  - dav: 1.31.1
  - federatedfilesharing: 1.20.0
  - federation: 1.20.0
  - files: 2.2.0
  - files_downloadlimit: 3.0.0
  - files_external: 1.22.0
  - files_pdfviewer: 3.0.0
  - files_reminders: 1.3.0
  - files_sharing: 1.22.0
  - files_trashbin: 1.20.1
  - files_versions: 1.23.0
  - firstrunwizard: 3.0.0
  - groupfolders: 18.0.1
  - logreader: 3.0.0
  - lookup_server_connector: 1.18.0
  - nextcloud_announcements: 2.0.0
  - notifications: 3.0.0
  - oauth2: 1.18.1
  - password_policy: 2.0.0
  - photos: 3.0.2
  - privacy: 2.0.0
  - provisioning_api: 1.20.0
  - recommendations: 3.0.0
  - related_resources: 1.5.0
  - serverinfo: 2.0.0
  - settings: 1.13.0
  - sharebymail: 1.20.0
  - support: 2.0.0
  - survey_client: 2.0.0
  - systemtags: 1.20.0
  - text: 4.1.0
  - theming: 2.5.0
  - twofactor_backupcodes: 1.19.0
  - updatenotification: 1.20.0
  - user_ldap: 1.21.0
  - user_oidc: 6.0.1
  - user_status: 1.10.0
  - viewer: 3.0.0
  - weather_status: 1.10.0
  - webhook_listeners: 1.1.0-dev
  - workflowengine: 2.12.0
Disabled:
  - encryption: 2.18.0
  - suspicious_login: 8.0.0
  - twofactor_nextcloud_notification: 4.0.0
  - twofactor_totp: 12.0.0-dev

Nextcloud Signing status

No errors have been found.

Nextcloud Logs

No response

Additional info

I am migrating Nextcloud to another server. The new server is configured to use Amazon S3 as primary storage. The old server has 63 S3 external mounts, and each of those S3-mounted folders is to be copied to a corresponding group folder on the new server. I cannot simply copy the files to the new server, because S3 primary storage uses a completely different object structure, so I am trying several copy strategies.

The buckets on the old server are in the same account and region as the destination primary bucket. I am trying the occ files:copy command. The copy process is very fast, but unfortunately files:copy aborts the transfer for large files or folders (it seems that it cannot copy more than 5GB).

The old server is running Nextcloud 29.0.5 and the destination is 30.0.0.

Below is an example of what happens when I try to copy a 5GB file (S3 mount) from the old server to the new server (group folder):

$ occ files:copy -vvv /cogeti/files/TESTE-COTED-CT/eLearningSuite_6_1_LS12.7z /cogeti/files/COTED-CT/

In Node.php line 412:

  [OCP\Files\NotPermittedException]
  Could not copy /cogeti/files/TESTE-COTED-CT/eLearningSuite_6_1_LS12.7z to /cogeti/files/COTED-CT/eLearningSuite_6_1_LS12.7z

Exception trace:
  at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/Node.php:412
 OC\Files\Node\Node->copy() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/apps/files/lib/Command/Copy.php:112
 OCA\Files\Command\Copy->execute() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Command/Command.php:298
 Symfony\Component\Console\Command\Command->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:1040
 Symfony\Component\Console\Application->doRunCommand() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:301
 Symfony\Component\Console\Application->doRun() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:171
 Symfony\Component\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Console/Application.php:183
 OC\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/console.php:87
 require_once() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/occ:11

While copying I see the eLearningSuite_6_1_LS12.7z.part file inside the group folder. The file disappears after the copy aborts.

I was able to copy a 2.9GB file between the very same folders in a few seconds but the 5GB file throws the OCP\Files\NotPermittedException exception. The user has full permissions on both folders and has an unlimited quota so it cannot be permission or quota related.

The same happens when I try to copy an entire folder whose files total more than 5GB, even if no single file in the source folder is very large.

I have also tried the GUI and DAV (via curl). In both cases the copy aborts after a few seconds. DAV aborts with a 504 gateway timeout status. BTW, as expected, the DAV copy is much slower than the files:copy command.

whbogado commented 2 months ago

To further investigate the issue I've tried:

$ occ files:get -vvv /cogeti/files/TESTE-COTED-CT/eLearningSuite_6_1_LS12.7z /tmp/eLearningSuite_6_1_LS12.7z

The 5GB file was downloaded successfully in a few seconds.

Then:

$ occ files:put -vvv /tmp/eLearningSuite_6_1_LS12.7z /cogeti/files/COTED-CT/eLearningSuite_6_1_LS12.7z

failed with:

In S3ObjectTrait.php line 130:

  [OCA\DAV\Connector\Sabre\Exception\BadGateway]
  Error while uploading to S3 bucket

Exception trace:
  at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/S3ObjectTrait.php:130
 OC\Files\ObjectStore\S3->writeMultiPart() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/S3ObjectTrait.php:155
 OC\Files\ObjectStore\S3->writeObject() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:520
 OC\Files\ObjectStore\ObjectStoreStorage->writeStream() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:446
 OC\Files\ObjectStore\ObjectStoreStorage->writeBack() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:354
 OC\Files\ObjectStore\ObjectStoreStorage->OC\Files\ObjectStore\{closure}() at n/a:n/a
 call_user_func() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/apps/files_external/3rdparty/icewind/streams/src/CallbackWrapper.php:117
 Icewind\Streams\CallbackWrapper->stream_close() at n/a:n/a
 fclose() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/View.php:623
 OC\Files\View->file_put_contents() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/Folder.php:167
 OC\Files\Node\Folder->newFile() at n/a:n/a
 call_user_func_array() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/LazyFolder.php:64
 OC\Files\Node\LazyFolder->__call() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/LazyFolder.php:443
 OC\Files\Node\LazyFolder->newFile() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/apps/files/lib/Command/Put.php:63
 OCA\Files\Command\Put->execute() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Command/Command.php:298
 Symfony\Component\Console\Command\Command->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:1040
 Symfony\Component\Console\Application->doRunCommand() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:301
 Symfony\Component\Console\Application->doRun() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:171
 Symfony\Component\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Console/Application.php:183
 OC\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/console.php:87
 require_once() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/occ:11

In AbstractUploadManager.php line 134:

  [Aws\S3\Exception\S3MultipartUploadException]
  An exception occurred while uploading parts to a multipart upload. The following parts had errors:
  - Part 6: Error executing "UploadPart" on "https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu.br/urn%3Aoid%3A3441346?partNumber=6&uploadId=LkqX4
  WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6dQQZrc.qFOzklnvP7RhPW3qFZbc"; AWS HTTP error: cURL
  error 28: SSL connection timeout (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu
  .br/urn%3Aoid%3A3441346?partNumber=6&uploadId=LkqX4WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6
  dQQZrc.qFOzklnvP7RhPW3qFZbc
  - Part 7: Error executing "UploadPart" on "https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu.br/urn%3Aoid%3A3441346?partNumber=7&uploadId=LkqX4
  WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6dQQZrc.qFOzklnvP7RhPW3qFZbc"; AWS HTTP error: cURL
  error 28: SSL connection timeout (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu
  .br/urn%3Aoid%3A3441346?partNumber=7&uploadId=LkqX4WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6
  dQQZrc.qFOzklnvP7RhPW3qFZbc
  - Part 8: Error executing "UploadPart" on "https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu.br/urn%3Aoid%3A3441346?partNumber=8&uploadId=LkqX4
  WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6dQQZrc.qFOzklnvP7RhPW3qFZbc"; AWS HTTP error: cURL
  error 28: SSL connection timeout (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu
  .br/urn%3Aoid%3A3441346?partNumber=8&uploadId=LkqX4WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6
  dQQZrc.qFOzklnvP7RhPW3qFZbc
  - Part 9: Error executing "UploadPart" on "https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu.br/urn%3Aoid%3A3441346?partNumber=9&uploadId=LkqX4
  WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6dQQZrc.qFOzklnvP7RhPW3qFZbc"; AWS HTTP error: cURL
  error 28: SSL connection timeout (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://s3.sa-east-1.amazonaws.com/drive.ct.utfpr.edu
  .br/urn%3Aoid%3A3441346?partNumber=9&uploadId=LkqX4WnUnN.SENJrjXIGkP3i.9MIDxE2y83izP4__53rHTrZI3vjXD637tPq2bj841YUdl_USVFD.ojgB.8C30Zm46QIOMogUR0L6
  dQQZrc.qFOzklnvP7RhPW3qFZbc

Exception trace:
  at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/aws/aws-sdk-php/src/Multipart/AbstractUploadManager.php:134
 Aws\Multipart\AbstractUploadManager->Aws\Multipart\{closure}() at n/a:n/a
 Generator->send() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Coroutine.php:137
 GuzzleHttp\Promise\Coroutine->_handleSuccess() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:209
 GuzzleHttp\Promise\Promise::callHandler() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:158
 GuzzleHttp\Promise\Promise::GuzzleHttp\Promise\{closure}() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/TaskQueue.php:52
 GuzzleHttp\Promise\TaskQueue->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php:163
 GuzzleHttp\Handler\CurlMultiHandler->tick() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php:189
 GuzzleHttp\Handler\CurlMultiHandler->execute() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:251
 GuzzleHttp\Promise\Promise->invokeWaitFn() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:227
 GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:272
 GuzzleHttp\Promise\Promise->invokeWaitList() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:229
 GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:272
 GuzzleHttp\Promise\Promise->invokeWaitList() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:229
 GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:69
 GuzzleHttp\Promise\Promise->wait() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Coroutine.php:68
 GuzzleHttp\Promise\Coroutine->GuzzleHttp\Promise\{closure}() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:251
 GuzzleHttp\Promise\Promise->invokeWaitFn() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:227
 GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:272
 GuzzleHttp\Promise\Promise->invokeWaitList() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:229
 GuzzleHttp\Promise\Promise->waitIfPending() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/guzzlehttp/promises/src/Promise.php:69
 GuzzleHttp\Promise\Promise->wait() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/aws/aws-sdk-php/src/Multipart/AbstractUploadManager.php:83
 Aws\Multipart\AbstractUploadManager->upload() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/S3ObjectTrait.php:122
 OC\Files\ObjectStore\S3->writeMultiPart() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/S3ObjectTrait.php:155
 OC\Files\ObjectStore\S3->writeObject() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:520
 OC\Files\ObjectStore\ObjectStoreStorage->writeStream() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:446
 OC\Files\ObjectStore\ObjectStoreStorage->writeBack() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php:354
 OC\Files\ObjectStore\ObjectStoreStorage->OC\Files\ObjectStore\{closure}() at n/a:n/a
 call_user_func() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/apps/files_external/3rdparty/icewind/streams/src/CallbackWrapper.php:117
 Icewind\Streams\CallbackWrapper->stream_close() at n/a:n/a
 fclose() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/View.php:623
 OC\Files\View->file_put_contents() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/Folder.php:167
 OC\Files\Node\Folder->newFile() at n/a:n/a
 call_user_func_array() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/LazyFolder.php:64
 OC\Files\Node\LazyFolder->__call() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Files/Node/LazyFolder.php:443
 OC\Files\Node\LazyFolder->newFile() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/apps/files/lib/Command/Put.php:63
 OCA\Files\Command\Put->execute() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Command/Command.php:298
 Symfony\Component\Console\Command\Command->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:1040
 Symfony\Component\Console\Application->doRunCommand() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:301
 Symfony\Component\Console\Application->doRun() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/3rdparty/symfony/console/Application.php:171
 Symfony\Component\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/lib/private/Console/Application.php:183
 OC\Console\Application->run() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/console.php:87
 require_once() at /efs/ecs/drive.ct.utfpr.edu.br/nextcloud/occ:11

files:put <input> <file>

While running the put command there was a 5GB eLearningSuite_6_1_LS12.7z.part file in the destination group folder.

whbogado commented 1 month ago

Looking at the code it seems clear to me what the problem is. Instead of retrying the upload, a \OCA\DAV\Connector\Sabre\Exception\BadGateway exception is thrown and the upload is aborted when ANY upload part fails.

This seems very likely to happen for a large file, which is uploaded in many parallel parts; smaller files are much less likely to hit it. Retrying the command MAY eventually succeed, but it has to restart the upload from the beginning, whereas the S3 API allows re-uploading only the failed parts.

protected function writeMultiPart(string $urn, StreamInterface $stream, ?string $mimetype = null): void {
    $uploader = new MultipartUploader($this->getConnection(), $stream, [
        'bucket' => $this->bucket,
        'concurrency' => $this->concurrency,
        'key' => $urn,
        'part_size' => $this->uploadPartSize,
        'params' => [
            'ContentType' => $mimetype,
            'StorageClass' => $this->storageClass,
        ] + $this->getSSECParameters(),
    ]);

    try {
        $uploader->upload();
    } catch (S3MultipartUploadException $e) {
        // if anything goes wrong with multipart, make sure that you don´t poison and
        // slow down s3 bucket with orphaned fragments
        $uploadInfo = $e->getState()->getId();
        if ($e->getState()->isInitiated() && (array_key_exists('UploadId', $uploadInfo))) {
            $this->getConnection()->abortMultipartUpload($uploadInfo);
        }
        throw new \OCA\DAV\Connector\Sabre\Exception\BadGateway('Error while uploading to S3 bucket', 0, $e);
    }
}
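Because the method above aborts the whole upload as soon as any single part fails, the chance of an abort grows with the number of parts. A rough sketch of that arithmetic follows. The 500 MiB part size and the 5% per-part failure rate are assumptions picked purely for illustration, not values measured on this instance, and parts are assumed to fail independently. The helper names `partCount` and `anyPartFails` are hypothetical.

```php
<?php
declare(strict_types=1);

/** Number of parts needed to upload $fileSize bytes in chunks of $partSize bytes. */
function partCount(int $fileSize, int $partSize): int
{
    // Ceiling division: a trailing partial chunk still needs its own part.
    return intdiv($fileSize + $partSize - 1, $partSize);
}

/** Probability that at least one of $parts independent parts fails. */
function anyPartFails(float $perPartFailure, int $parts): float
{
    return 1.0 - (1.0 - $perPartFailure) ** $parts;
}

$partSize = 500 * 1024 * 1024; // assumed part size, for illustration only
$p = 0.05;                     // assumed per-part failure rate, not a measured value

foreach ([2.9, 5.0] as $gib) {
    $size = (int) round($gib * 1024 ** 3);
    $parts = partCount($size, $partSize);
    printf("%.1f GiB: %d parts, P(abort) = %.2f\n", $gib, $parts, anyPartFails($p, $parts));
}
```

Under these assumptions the larger file needs roughly twice as many parts as the 2.9GB one, so it is noticeably more likely to hit at least one failed part and lose the whole transfer.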

AWS documentation shows how to recover from a partial upload failure: https://docs.aws.amazon.com/sdk-for-php/v3/developer-guide/s3-multipart-upload.html#recovering-from-errors

I've tried this, and the upload succeeded even though some parts failed to upload at first. Something like this:

        do {
            try {
                $result = $uploader->upload();
            } catch (S3MultipartUploadException $e) {
                $uploader = new MultipartUploader($this->getConnection(), $stream, [
                    'state' => $e->getState(),
                ]);
            }
        } while (!isset($result));

A retry limit or timeout would probably have to be implemented to avoid an endless loop.
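One way to bound the loop is to cap the number of attempts. The sketch below is a generic helper, not Nextcloud or AWS SDK code: `retryWithRecovery`, its parameters, and the simulated flaky upload are all hypothetical names introduced for illustration, so it runs without the SDK.

```php
<?php
declare(strict_types=1);

/**
 * Run $attempt until it succeeds, at most $maxAttempts times.
 * After each failure, $recover receives the exception and returns the
 * callable to use for the next try (e.g. one bound to a resumed uploader).
 * Hypothetical helper for illustration only.
 */
function retryWithRecovery(callable $attempt, callable $recover, int $maxAttempts): mixed
{
    if ($maxAttempts < 1) {
        throw new InvalidArgumentException('maxAttempts must be at least 1');
    }
    $lastError = null;
    for ($try = 1; $try <= $maxAttempts; $try++) {
        try {
            return $attempt();
        } catch (Exception $e) {
            $lastError = $e;
            if ($try < $maxAttempts) {
                // Prepare the next attempt from the failure (resume, not restart).
                $attempt = $recover($e);
            }
        }
    }
    // Attempts exhausted: surface the last error instead of looping forever.
    throw new RuntimeException("Giving up after {$maxAttempts} attempts", 0, $lastError);
}

// Simulated upload: fails on the first two calls, then succeeds.
$calls = 0;
$flaky = function () use (&$calls): string {
    $calls++;
    if ($calls < 3) {
        throw new RuntimeException("part upload timed out (call {$calls})");
    }
    return 'uploaded';
};

// In this simulation "recovery" simply hands back the same callable.
$result = retryWithRecovery($flaky, fn (Exception $e) => $flaky, 5);
echo $result, " after {$calls} calls\n";
```

In writeMultiPart, `$attempt` would presumably wrap `$uploader->upload()`, and `$recover` would build a new MultipartUploader from `$e->getState()` as in the snippet above. Once the attempts are exhausted, the existing abortMultipartUpload cleanup and BadGateway exception could still run. Whether a fixed attempt count or a wall-clock timeout fits better is a design choice this sketch does not settle.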