nextcloud / documentation

📘 Nextcloud documentation
https://docs.nextcloud.com

Explanation why using external storage outside of Nextcloud is unreliable and hints to make it work with limitations #7849

Open PVince81 opened 2 years ago

PVince81 commented 2 years ago

Todos

Goal: to be able to point people at this explanation without needing to add anything. It needs to be straightforward and understandable right away.

Draft

When an external storage is mounted in Nextcloud and files on it are changed outside of Nextcloud, Nextcloud does not learn about these changes right away because they have not been indexed.

As a result, desktop clients do not sync these changes either, because they are not aware of them.

1) By default, with the option "Check once every access", whenever a user accesses the file tree in the web UI or a mobile app (or a PROPFIND is made on a deep folder), Nextcloud rescans the tree level that was accessed and updates its index (aka filecache) with the newly found information. Because this is driven by user actions, it happens at unpredictable times and is not reliable.

2) The other options, "Always" and "Never", do not help either: "Always" checks multiple times per PHP request but is still driven by user actions, and "Never" does not check for changes at all.

The desktop client only accesses the root of the sync folder, because looking deeper on every sync run would be too expensive. To sync deep changes, it needs a trail of changed etags that it can follow down to the tree level where the changes happened. For this, it relies on something else to detect those changes. As described in 1), this can happen by chance: if a user happens to access the folder that contains the change, the etag changes are propagated automatically up to the root, so the desktop client eventually finds the trail.
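The etag trail can be illustrated with a small model. This is only a sketch under simplifying assumptions, not Nextcloud's or the desktop client's actual code: the `Node` class and the `propagate_change`, `snapshot` and `discover` helpers are hypothetical names invented for the example.

```python
import itertools
from dataclasses import dataclass, field

_etag_counter = itertools.count(1)


def new_etag() -> str:
    """Return a fresh, unique etag value (stand-in for the server's real etags)."""
    return f"etag-{next(_etag_counter)}"


@dataclass
class Node:
    """A folder in the server's index (hypothetical model of a filecache entry)."""
    name: str
    parent: "Node | None" = None
    children: dict = field(default_factory=dict)
    etag: str = field(default_factory=new_etag)

    def add(self, name):
        child = Node(name, parent=self)
        self.children[name] = child
        return child


def propagate_change(node):
    """Server side: after a rescan finds a change, the folder and all its
    ancestors up to the root get a new etag."""
    while node is not None:
        node.etag = new_etag()
        node = node.parent


def snapshot(node, path="/"):
    """Client side: remember the etag of every folder seen during the last sync."""
    etags = {path: node.etag}
    for name, child in node.children.items():
        etags.update(snapshot(child, f"{path}{name}/"))
    return etags


def discover(node, known, path="/"):
    """Client side: start at the root and descend only into folders whose etag
    differs from the remembered one. Returns the folders that were visited."""
    if known.get(path) == node.etag:
        return []  # unchanged etag: the whole subtree is skipped
    visited = [path]
    for name, child in node.children.items():
        visited.extend(discover(child, known, f"{path}{name}/"))
    return visited


root = Node("")
docs = root.add("docs")
reports = docs.add("reports")
photos = root.add("photos")
known = snapshot(root)

# A file is changed in /docs/reports/ outside of Nextcloud and nothing has
# rescanned that folder yet: no etag changed, so the client finds nothing.
unnoticed = discover(root, known)

# Once the folder is rescanned, the change is propagated up to the root and
# the client can follow the trail down to it.
propagate_change(reports)
trail = discover(root, known)
```

In this model `unnoticed` is empty, while `trail` leads from the root through `/docs/` to `/docs/reports/` and skips `/photos/` entirely.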

What we want is for Nextcloud to detect changes deep in the folder structure automatically.

There are two ways to achieve this:

A) For SMB storages, use the files_external:notify command, which listens to change events sent by SMB itself and then tells Nextcloud which tree level needs to be rescanned. This results in the desired etag propagation.

B) For non-SMB storages, set up a cron job that runs "occ files:scan -p /path/to/storage" periodically. The drawback is that the whole storage is rescanned on every run and changes are only detected while a scan is running, so detection can be slow.
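For option B), the periodic scan could be wrapped in a small script that cron calls. The following is a minimal sketch, assuming the PHP CLI is available as `php` and that the script runs as the web server user; the helper names `build_scan_command` and `run_scan` are made up for this example, and the occ location and storage path are passed in as arguments rather than assumed.

```python
import subprocess
import sys


def build_scan_command(occ_path, storage_path):
    """Build the argument list for "occ files:scan -p <storage_path>"."""
    return ["php", occ_path, "files:scan", "-p", storage_path]


def run_scan(occ_path, storage_path):
    """Run one scan and return the exit code of the occ process."""
    result = subprocess.run(build_scan_command(occ_path, storage_path))
    return result.returncode


# Intended to be invoked from cron with two arguments:
#   <path to occ> <path to the storage as seen by Nextcloud>
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(run_scan(sys.argv[1], sys.argv[2]))
```

How often cron should call such a script depends on how long a full scan of the storage takes, since overlapping runs only add load.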

Unfortunately, unlike SMB, most storage types do not support change notifications, so there is no reliable way to receive such updates in a timely manner.

Limitations: in scenario B) the scanner cannot reliably detect renamed entries, so it will believe that the old entry was deleted and a new one was created. This causes metadata loss, because the metadata of the old entry is deleted. Nextcloud metadata includes the shares, tags, comments, activities, etc. associated with the entry in question.
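The rename limitation can be sketched with a simplified path-based scanner. This is an illustrative model only, not Nextcloud's scanner: the `rescan` function and the `filecache` (path to file id) and `metadata` (file id to attached data) dictionaries are hypothetical stand-ins.

```python
from itertools import count


def rescan(filecache, metadata, paths_on_storage, ids):
    """Reconcile the index with the paths currently found on the storage.

    The scanner only sees paths, so a rename looks like one path
    disappearing and an unrelated path appearing.
    """
    for path in set(filecache) - set(paths_on_storage):
        file_id = filecache.pop(path)
        metadata.pop(file_id, None)  # metadata of the "deleted" entry is dropped
    for path in sorted(set(paths_on_storage) - set(filecache)):
        filecache[path] = next(ids)  # the "new" entry gets a fresh id


ids = count(1)
filecache, metadata = {}, {}

# First scan indexes the file; a user then tags it inside Nextcloud.
rescan(filecache, metadata, ["report.odt"], ids)
metadata[filecache["report.odt"]] = {"tags": ["important"]}

# The file is renamed outside of Nextcloud, then the periodic scan runs.
rescan(filecache, metadata, ["report-final.odt"], ids)
```

After the second scan, `filecache` holds only `report-final.odt` under a new id and `metadata` is empty: the tag that belonged to the old entry is gone.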

PVince81 commented 2 years ago

@schiessle @AndyScherzinger see above as we discussed this before, I hope to get this documented eventually

PVince81 commented 2 years ago

cc @icewind1991 in case you think there is anything to add
