nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0

[Bug]: files are written twice during upload #43636

Closed Noodlesalat closed 4 months ago

Noodlesalat commented 8 months ago


Bug description

I have encountered the issue that files uploaded via WebDAV are written twice by Nextcloud. The first copy is stored in PHP's upload_tmp_dir and is named phpXXXXXX. The second copy is stored in the file's target directory and is named $filename.ocTransferId$ID.part.

According to iotop the file is actually written twice, because the write speed to the storage is approximately twice the upload speed.

Steps to reproduce

  1. Upload a file via WebDAV.
  2. While the upload is running, check PHP's upload_tmp_dir and the file's target directory for the duplicate file.
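
For anyone who wants to script step 1, here is a minimal sketch that builds a WebDAV PUT request with the Python standard library. The host, user name, app password, and remote path are placeholders, and the `remote.php/dav/files/<user>/` URL layout is an assumption about the server's WebDAV endpoint that should be checked against your instance. The function only builds the request; it does not send it.

```python
import base64
import urllib.request


def build_webdav_put(base_url: str, user: str, password: str,
                     remote_path: str, payload: bytes) -> urllib.request.Request:
    """Build (but do not send) a WebDAV PUT request for a single file."""
    # Assumed endpoint layout: <base>/remote.php/dav/files/<user>/<path>
    url = (f"{base_url.rstrip('/')}/remote.php/dav/files/"
           f"{user}/{remote_path.lstrip('/')}")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Authorization": f"Basic {token}"},
    )


if __name__ == "__main__":
    # Placeholder values; replace them before running against a real server.
    req = build_webdav_put("https://example.com", "alice", "app-password",
                           "test/big.bin", b"\0" * 1024)
    print(req.get_method(), req.full_url)
    # To actually upload, call: urllib.request.urlopen(req)
```

While a large upload is in flight, listing upload_tmp_dir and the target directory should show the two files described above.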

Expected behavior

The file is either written to the target directory and renamed to the target filename after a successful upload, or it is written to PHP's upload_tmp_dir and moved to the target folder after a successful upload.
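
The first variant of the expected behaviour can be sketched in Python as follows: the data is streamed once into a `.part` file next to the target and then renamed into place. This illustrates the single-write pattern only; it is not Nextcloud's code, and the function name `store_upload` is hypothetical.

```python
import os


def store_upload(chunks, target_path: str) -> None:
    """Stream chunks into a sibling .part file, then rename it into place."""
    part_path = target_path + ".part"
    with open(part_path, "wb") as part:
        for chunk in chunks:
            part.write(chunk)
        # Make sure the data has reached the disk before the rename.
        part.flush()
        os.fsync(part.fileno())
    # The rename only touches metadata because both paths are in the same
    # directory, so the payload is written to disk exactly once.
    os.replace(part_path, target_path)
```

Because the `.part` file and the final file live in the same directory, the rename never degrades into a copy.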

Installation method

Community Manual installation with Archive

Nextcloud Server version

28

Operating system

Debian/Ubuntu

PHP engine version

PHP 8.2

Web server

Apache (supported)

Database engine version

MariaDB

Is this bug present after an update or on a fresh install?

Updated from a MINOR version (ex. 22.1 to 22.2)

Are you using the Nextcloud Server Encryption module?

Encryption is Disabled

What user-backends are you using?

Configuration report

{
    "system": {
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "example.com",
            "www.example.com"
        ],
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "overwriteprotocol": "https",
        "overwrite.cli.url": "https:\/\/example.com",
        "dbtype": "mysql",
        "version": "28.0.2.5",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "maintenance": false,
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpmode": "smtp",
        "mail_smtpauthtype": "LOGIN",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "memcache.local": "\\OC\\Memcache\\Redis",
        "filelocking.enabled": "true",
        "memcache.distributed": "\\OC\\Memcache\\Redis",
        "memcache.locking": "\\OC\\Memcache\\Redis",
        "redis": {
            "host": "***REMOVED SENSITIVE VALUE***",
            "port": 0,
            "timeout": 0
        },
        "theme": "",
        "loglevel": 2,
        "logtimezone": "Europe\/Berlin",
        "log_rotate_size": "104857600",
        "mail_smtpsecure": "tls",
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpport": "587",
        "mail_smtpauth": 1,
        "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
        "mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
        "mysql.utf8mb4": true,
        "trashbin_retention_obligation": "auto, 14",
        "versions_retention_obligation": "auto, 60",
        "default_phone_region": "DE",
        "trusted_proxies": "***REMOVED SENSITIVE VALUE***",
        "enable_previews": true,
        "enabledPreviewProviders": [
            "OC\\Preview\\PNG",
            "OC\\Preview\\JPEG",
            "OC\\Preview\\GIF",
            "OC\\Preview\\BMP",
            "OC\\Preview\\XBitmap",
            "OC\\Preview\\Movie",
            "OC\\Preview\\PDF",
            "OC\\Preview\\MP3",
            "OC\\Preview\\TXT",
            "OC\\Preview\\MarkDown"
        ],
        "preview_max_x": 1024,
        "preview_max_y": 768,
        "preview_max_scale_factor": 1,
        "skeletondirectory": "",
        "knowledgebaseenabled": false,
        "maintenance_window_start": 1
    }
}

List of activated Apps

Enabled:
  - activity: 2.20.0
  - calendar: 4.6.5
  - cloud_federation_api: 1.11.0
  - comments: 1.18.0
  - contacts: 5.5.1
  - contactsinteraction: 1.9.0
  - dav: 1.29.1
  - federatedfilesharing: 1.18.0
  - federation: 1.18.0
  - files: 2.0.0
  - files_external: 1.20.0
  - files_pdfviewer: 2.9.0
  - files_reminders: 1.1.0
  - files_sharing: 1.20.0
  - files_trashbin: 1.18.0
  - files_versions: 1.21.0
  - firstrunwizard: 2.17.0
  - logreader: 2.13.0
  - lookup_server_connector: 1.16.0
  - mail: 3.5.6
  - nextcloud_announcements: 1.17.0
  - notifications: 2.16.0
  - oauth2: 1.16.3
  - password_policy: 1.18.0
  - photos: 2.4.0
  - privacy: 1.12.0
  - provisioning_api: 1.18.0
  - related_resources: 1.3.0
  - serverinfo: 1.18.0
  - settings: 1.10.1
  - sharebymail: 1.18.0
  - spreed: 18.0.3
  - systemtags: 1.18.0
  - tasks: 0.15.0
  - text: 3.9.1
  - theming: 2.3.0
  - twofactor_backupcodes: 1.17.0
  - updatenotification: 1.18.0
  - viewer: 2.2.0
  - workflowengine: 2.10.0
Disabled:
  - admin_audit: 1.18.0
  - bruteforcesettings: 2.8.0 (installed 2.4.0)
  - circles: 28.0.0-dev (installed 22.1.1)
  - dashboard: 7.8.0 (installed 7.0.0)
  - deck: 1.12.2 (installed 1.12.2)
  - encryption: 2.16.0
  - files_rightclick: 0.15.1 (installed 1.6.0)
  - forms: 4.1.0 (installed 4.1.0)
  - maps: 1.3.1 (installed 1.3.1)
  - notes: 4.9.2 (installed 4.9.2)
  - recommendations: 2.0.0 (installed 0.5.0)
  - support: 1.11.0 (installed 1.0.0)
  - survey_client: 1.16.0 (installed 1.2.0)
  - suspicious_login: 6.0.0
  - twofactor_totp: 10.0.0-beta.2
  - user_ldap: 1.19.0
  - user_status: 1.8.1 (installed 1.0.1)
  - weather_status: 1.8.0 (installed 1.0.0)

Nextcloud Signing status

No errors have been found.

Nextcloud Logs

- Removed due to privacy reasons -

Additional info

The Nextcloud data directory is located on another partition and not directly in the web directory.

kesselb commented 8 months ago

Sounds like https://github.com/nextcloud/server/issues/30843 and https://github.com/nextcloud/server/issues/19682.

> phpXXXXXX

Afaik, we cannot prevent PHP from storing files temporarily on disk.

> The Nextcloud data directory is located on another partition and not directly in the web directory.

Is the upload_tmp_dir on the same partition?

Noodlesalat commented 8 months ago

Thanks for your answer!

So you mean that the phpXXXXXX file is automatically generated by php during the upload and Nextcloud has no influence on this behaviour? Would it then be possible to prevent the creation of the .part file and only work with the phpXXXXXX file? In other words, after a successful upload to upload_tmp_dir, the file would be moved to the target folder in the user's data directory.

The upload_tmp_dir is indeed located on the same partition as the data directory (tmp: /mnt/tmp; data: /mnt/cloud).
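
Whether two directories sit on the same filesystem can also be checked programmatically by comparing their device IDs. The sketch below uses `os.stat`; the two paths in the usage section are the ones quoted above and are assumed to exist on the machine running it.

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths live on the same mounted filesystem."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Paths taken from this comment; adjust them to your own setup.
    print(same_filesystem("/mnt/tmp", "/mnt/cloud"))
```

A move within one filesystem is a metadata-only rename, while a move across filesystems has to copy the data, so the result indicates whether moving a temp file could avoid a second write at all.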

Noodlesalat commented 8 months ago

I just found out that the issue also occurs with uploads via the Nextcloud client, which uses chunked uploads iirc. In this case, a phpXXXXXX file is created in the upload_tmp_dir and the same file is stored in the folder $DATADIR/$USERNAME/uploads/$ID?/.

This means that an upload is always written to the hard disk twice, especially if the data dir and the upload_tmp_dir are on the same partition.

Therefore I updated the issue title.

kesselb commented 8 months ago

> So you mean that the phpXXXXXX file is automatically generated by php during the upload and Nextcloud has no influence on this behaviour?

This is my current state of knowledge.

> Would it then be possible to prevent the creation of the .part file and only work with the phpXXXXXX file?

The information about which tmp file is used for the upload is available when using the POST method: https://www.php.net/manual/en/features.file-upload.post-method.php.

Afaik, we are using another way to process uploads, the PUT method: https://www.php.net/manual/en/features.file-upload.put-method.php.

Boc-chi-no commented 7 months ago

This issue significantly impacts my system performance. I set upload_tmp_dir to a large tmpfs disk in hopes of speeding up performance. However, what confuses me is that Nextcloud writes upload chunks from upload_tmp_dir to data_dir, then merges them in the data_dir directory before writing them to the target directory. I don't understand why Nextcloud does this, as it renders my upload_tmp_dir setting ineffective.

skjnldsv commented 5 months ago

> This means that an upload is always written to the hard disk twice, especially if the data dir and the upload_tmp_dir are on the same partition.

Are we sure they are not referring to the same inode and are in fact really two different files?

Maybe we can ask @juliushaertl or @icewind1991 if they know more about the handling of uploads and whether Nextcloud really has any influence on how upload_tmp_dir is used :eyes:

Noodlesalat commented 4 months ago

checked with stat -c %i $filename on the "same" file:

root@nextcloud /mnt/tmp # stat -c %i phpEQ8lE7 
49938437
------------------------------
root@nextcloud /mnt/cloud/USERNAME/uploads/193146716 # stat -c %i 00011.ocTransferId594818750.part 
44302342

It seems like the inodes differ. This also correlates with the output of iotop as it is approximately twice my upload speed.
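
The same check can be scripted. Below is a minimal Python sketch that compares the device and inode numbers of two paths; `same_file` is a hypothetical helper name. The demo creates a hard link and a copy in a temporary directory to show the difference: a hard link shares the inode, a copy does not.

```python
import os
import shutil
import tempfile


def same_file(path_a: str, path_b: str) -> bool:
    """Return True if both paths refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original")
        with open(original, "wb") as fh:
            fh.write(b"payload")
        linked = os.path.join(tmp, "hardlink")
        copied = os.path.join(tmp, "copy")
        os.link(original, linked)        # second name for the same inode
        shutil.copy(original, copied)    # separate inode, data written again
        print(same_file(original, linked))
        print(same_file(original, copied))
```

Differing inodes, as in the `stat` output above, mean the data exists twice on disk rather than under two names.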

juliusknorr commented 4 months ago

As far as I dug into this in the past, this is expected internal PHP behaviour.

Any access of the php://input stream will create a tmp file if the data is larger than 8k; otherwise the stream data will be held in memory.

https://github.com/php/php-src/blob/01b3fc03c30c6cb85038250bb5640be3a09c6a32/ext/standard/php_fopen_wrapper.c#L231

There is no way to avoid this as far as I know. PHP is also not exposing the file itself, so we can't work with that.
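
To illustrate the general technique described here (buffer in memory up to a threshold, then spill to a temp file), here is a small Python analogy. It is not PHP's implementation: the class name `SpillBuffer` is hypothetical, and the 8192-byte default simply mirrors the 8k figure mentioned above.

```python
import io
import tempfile


class SpillBuffer:
    """Keep data in memory until a threshold, then move it to a temp file."""

    def __init__(self, threshold: int = 8192, tmp_dir=None):
        self.threshold = threshold
        self.tmp_dir = tmp_dir      # plays the role of upload_tmp_dir
        self._buf = io.BytesIO()
        self.on_disk = False

    def write(self, data: bytes) -> None:
        if not self.on_disk and self._buf.tell() + len(data) > self.threshold:
            # Threshold exceeded: copy what we have to disk and continue there.
            spilled = tempfile.TemporaryFile(dir=self.tmp_dir)
            spilled.write(self._buf.getvalue())
            self._buf = spilled
            self.on_disk = True
        self._buf.write(data)

    def read_all(self) -> bytes:
        self._buf.seek(0)
        return self._buf.read()


if __name__ == "__main__":
    buf = SpillBuffer()
    buf.write(b"x" * 4096)
    print(buf.on_disk)   # still in memory
    buf.write(b"x" * 8192)
    print(buf.on_disk)   # spilled to a temp file
```

A consumer that reads from such a buffer and writes the result elsewhere produces a second on-disk copy, which matches the doubled write rate reported in this issue. Python's standard library offers `tempfile.SpooledTemporaryFile` for the same pattern.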

skjnldsv commented 4 months ago

Closing then. Unless someone is able to prove we can control this behaviour, this is a wontfix :(