owncloud / core

:cloud: ownCloud web server core (Files, DAV, etc.)
https://owncloud.com
GNU Affero General Public License v3.0

External Storage Offline result in Deletion #11149

Closed Green9332 closed 6 years ago

Green9332 commented 10 years ago

Steps to reproduce

  1. Install OC and Windows Client
  2. Add SFTP external storage already containing data
  3. Wait for sync
  4. Take SFTP server offline
  5. OC clients start deleting local folders and files, beginning with empty folders, and report errors on folders that still contain files.

Expected behaviour

If there is an external storage defined and offline, do not assume it has been deleted altogether. Rather assume it is offline and for the time being, simply ignore/skip the referenced files and folders.

Actual behaviour

OC Server seemingly assumes the worst and communicates it as such to the connected clients.

Server configuration

Operating system: Windows 2012R2

Web server: IIS 8.5

Database: MariaDB 10

PHP version: 5.6

ownCloud version: (see ownCloud admin page) ownCloud 7.0.2 (stable)

Updated from an older ownCloud or fresh install: No

List of activated apps: OOTB, just External Storage added

The content of config/config.php:

<?php
$CONFIG = array (
  'datadirectory' => 'E:\shares\com\oc',
  'dbtype' => 'mysql',
  'version' => '7.0.2.1',
  'dbname' => 'owncloud',
  'dbhost' => 'localhost',
  'dbtableprefix' => 'oc_',
  'installed' => true,
  'forcessl' => true,
  'loglevel' => '2',
);

Are you using external storage, if yes which one: local/smb/sftp/... SFTP

Are you using encryption: yes/no No

Client configuration

Browser: FF, Chrome, IE - in that order

Operating system: Win7Ult OC Client 1.6.3 (build 3721)

Logs

Web server log

2014-09-17 19:17:45 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.5 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 894
2014-09-17 19:17:50 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 2810
2014-09-17 19:17:50 1.2.3.4 PROPFIND /remote.php/webdav/ZAFSS1/Projects - 443 - 1.2.3.5 Mozilla/5.0+(Windows)+mirall/1.6.3 - 404 0 0 2807
2014-09-17 19:17:54 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 6037
2014-09-17 19:17:56 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 4960
2014-09-17 19:17:56 1.2.3.4 PROPFIND /remote.php/webdav - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 401 0 0 359
2014-09-17 19:18:02 1.2.3.4 PROPFIND /remote.php/webdav - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 5015
2014-09-17 19:18:05 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 11264
2014-09-17 19:18:05 1.2.3.4 PROPFIND /remote.php/webdav - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 401 0 0 312
2014-09-17 19:18:22 1.2.3.4 GET /status.php - 443 - 1.2.3.5 Mozilla/5.0+(Windows)+mirall/1.6.3 - 200 0 0 234
2014-09-17 19:18:33 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.5 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 10645
2014-09-17 19:18:40 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 20791
2014-09-17 19:18:43 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.5 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 10691
2014-09-17 19:18:44 1.2.3.4 PROPFIND /remote.php/webdav - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 38300
2014-09-17 19:18:47 1.2.3.4 PROPFIND /remote.php/webdav/ZAFSS1/Projects - 443 - 1.2.3.5 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 401 0 0 328
2014-09-17 19:18:47 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 14465
2014-09-17 19:18:47 1.2.3.4 PROPFIND /remote.php/webdav - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 401 0 0 453
2014-09-17 19:18:48 1.2.3.4 PROPFIND /remote.php/webdav/ZAFSS1/Projects - 443 - 1.2.3.5 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 404 0 0 1046
2014-09-17 19:18:48 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 24441
2014-09-17 19:18:50 1.2.3.4 PROPFIND /remote.php/webdav - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 2895
2014-09-17 19:18:53 1.2.3.4 PROPFIND /remote.php/webdav/ADC - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 3131
2014-09-17 19:18:57 1.2.3.4 PROPFIND /remote.php/webdav/ADC/URLs - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 3385
2014-09-17 19:19:00 1.2.3.4 PROPFIND /remote.php/webdav/ADC/Training - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 3399
2014-09-17 19:19:03 1.2.3.4 PROPFIND /remote.php/webdav/ADC/Tests - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 3328
2014-09-17 19:19:03 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 1265
2014-09-17 19:19:05 1.2.3.4 PROPFIND /remote.php/webdav - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 401 0 0 328
2014-09-17 19:19:06 1.2.3.4 PROPFIND /remote.php/webdav - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 1578
2014-09-17 19:19:06 1.2.3.4 PROPFIND /remote.php/webdav/ADC/Source - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 3187
2014-09-17 19:19:09 1.2.3.4 PROPFIND /remote.php/webdav/GDMS - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 2515
2014-09-17 19:19:09 1.2.3.4 PROPFIND /remote.php/webdav/ADC/Releases - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 3265
2014-09-17 19:19:11 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 1062
2014-09-17 19:19:11 1.2.3.4 PROPFIND /remote.php/webdav/GDMS/URLs - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 2781
2014-09-17 19:19:13 1.2.3.4 PROPFIND /remote.php/webdav/ADC/Docs - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 3234
2014-09-17 19:19:13 1.2.3.4 PROPFIND /remote.php/webdav/ - 443 - 1.2.3.5 Mozilla/5.0+(Windows)+mirall/1.6.3 - 207 0 0 796
2014-09-17 19:19:14 1.2.3.4 PROPFIND /remote.php/webdav/GDMS/Training - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 2671
2014-09-17 19:19:16 1.2.3.4 PROPFIND /remote.php/webdav/ADC/Docs/Workflows - 443 - 1.2.3.7 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 3218
2014-09-17 19:19:16 1.2.3.4 PROPFIND /remote.php/webdav/GDMS/Tests - 443 - 1.2.3.6 Mozilla/5.0+(Windows)+csyncoC/0.91.5+neon/0.30.0 - 207 0 0 2859

ownCloud log (data/owncloud.log)

{"app":"PHP","message":"Cannot connect to apc.example.com. Error 10061. No connection could be made because the target machine actively refused it. at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#875","level":3,"time":"2014-09-17T19:17:48+00:00"}
{"app":"PHP","message":"unpack(): Type N: not enough input, need 4, have 0 at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3518","level":3,"time":"2014-09-17T19:17:48+00:00"}
{"app":"PHP","message":"extract() expects parameter 1 to be array, boolean given at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3518","level":3,"time":"2014-09-17T19:17:48+00:00"}
{"app":"PHP","message":"Undefined variable: length at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3519","level":3,"time":"2014-09-17T19:17:48+00:00"}
{"app":"PHP","message":"Unsupported signature format at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3638","level":3,"time":"2014-09-17T19:17:48+00:00"}
{"app":"core","message":"Login failed","level":3,"time":"2014-09-17T19:17:48+00:00"}
{"app":"PHP","message":"Cannot connect to apc.example.com. Error 10061. No connection could be made because the target machine actively refused it. at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#875","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"PHP","message":"unpack(): Type N: not enough input, need 4, have 0 at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3518","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"PHP","message":"extract() expects parameter 1 to be array, boolean given at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3518","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"PHP","message":"Undefined variable: length at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3519","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"PHP","message":"Unsupported signature format at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3638","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"core","message":"Login failed","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"PHP","message":"Cannot connect to apc.example.com. Error 10061. No connection could be made because the target machine actively refused it. at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#875","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"PHP","message":"unpack(): Type N: not enough input, need 4, have 0 at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3518","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"PHP","message":"extract() expects parameter 1 to be array, boolean given at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3518","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"PHP","message":"Undefined variable: length at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3519","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"PHP","message":"Unsupported signature format at D:\sites\com\oc\3rdparty\phpseclib\phpseclib\phpseclib\Net\SSH2.php#3638","level":3,"time":"2014-09-17T19:17:49+00:00"}
{"app":"core","message":"Login failed","level":3,"time":"2014-09-17T19:17:49+00:00"}

LukasReschke commented 10 years ago

@MTRichards Might also be important for EE considering SP et al.

MTRichards commented 10 years ago

@jnfrmarks can we test? @rperezb what happens in this situation for SP and network drives?

rperezb commented 10 years ago

Checked on Network Drive; it depends on the connectivity check:

From ownCloud we are not deleting anything. We simply are not showing any information because we don't have access to it.

[Image: screen shot 2014-09-18 at 15.21.17]

MTRichards commented 10 years ago

I am most concerned about what happens to the desktop client. Do we delete the files? We should not; it should throw a temporary error...

Green9332 commented 10 years ago

The only good thing about this is that the folders/files end up in Deleted Items and can therefore be restored on the OC server. But if you had a 100GB repository shared with 100 clients, OC would start deleting potentially 10TB of files on the clients, all of which would have to be replicated again. Empty folders are definitely all deleted. As mine was a production system (in which I had to restart the 50TB RAID NAS for a software update), I did not sit idly by to wait until the damage was done. In fact, I watched it very closely, since I didn't know what would happen and expected something iffy to come of the situation.

So, I stopped the OC rather fast to prevent any further checking/replicating/propagation of the issue. What I could see in real time was that the client also tried to delete the folders which contained files, but gave errors. Most likely the only saving grace was that the OC client tried to delete the folders before deleting the files they contained, thereby running into a "folder not empty" error. If not for that, I'm pretty sure the files would eventually have been wiped as well, and then the folders.

So, whilst there may not be an issue on the web interface for a disconnected store, you may also not have anything to show after a while if the OC has moved it all to Deleted Items. And/or your files will still be on the store (since it was offline during the OC process), but OC would have "moved" its cached references to the Deleted Items.

Either way, it is a train smash waiting to happen.

RobinMcCorkell commented 10 years ago

Yes, this was noticed a while back in that issue I just linked. ownCloud currently does not differentiate between non-existent/broken storages and temporary failures. @icewind1991 what do you think, high severity?

PVince81 commented 10 years ago

I think @icewind1991 has added some code for server to server sharing where it's possible for mount points to have the state "currently not available". The same status/code could be reused for regular external storages, assuming that the clients should be able to understand that.

I believe it might return a 503 when propfinding that folder.

@icewind1991 can you confirm?

icewind1991 commented 10 years ago

Yes, storages can throw a \OCP\Files\StorageNotAvailableException, which will be handled correctly by the web interface and the sync client.
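To illustrate the distinction being discussed, here is a minimal sketch in Python rather than ownCloud's PHP. `StorageNotAvailableError`, `propfind_status`, and the two lookup helpers are hypothetical names, and the 503 mapping follows PVince81's expectation above ("I believe it might return a 503"), not confirmed server code. The point is only that an unreachable storage and a missing path should produce different answers.

```python
from http import HTTPStatus


class StorageNotAvailableError(Exception):
    """Hypothetical stand-in for ownCloud's StorageNotAvailableException."""


def propfind_status(lookup, path):
    """Return the HTTP status a PROPFIND on `path` would receive.

    `lookup` is any callable that returns file metadata, returns None
    when the path does not exist, or raises StorageNotAvailableError
    when the backing storage cannot be reached.
    """
    try:
        entry = lookup(path)
    except StorageNotAvailableError:
        # Temporary failure: ask the client to try again later.
        return HTTPStatus.SERVICE_UNAVAILABLE
    if entry is None:
        # The storage answered and the path really is gone.
        return HTTPStatus.NOT_FOUND
    return HTTPStatus.MULTI_STATUS


def online_lookup(path):
    """Illustrative storage that is reachable and holds one folder."""
    files = {"/ZAFSS1/Projects": {"type": "dir"}}
    return files.get(path)


def offline_lookup(path):
    """Illustrative storage whose SFTP host refuses connections."""
    raise StorageNotAvailableError("SFTP host refused the connection")


for lookup in (online_lookup, offline_lookup):
    print(lookup.__name__, int(propfind_status(lookup, "/ZAFSS1/Projects")))
```

In the web server log at the top of this issue, the offline mount point `/remote.php/webdav/ZAFSS1/Projects` was answered with 404, which is the response a client cannot tell apart from a real deletion.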

rperezb commented 10 years ago

@MTRichards yes, if there is no connectivity with the mount point, the data is "removed" from the desktop client. For us, it's the same situation as when your credentials have not been entered or have been modified. If we did not want the data to be removed, we would have to change the plugin design and also check how the desktop client handles HTTP 503.
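A rough sketch of what "handling HTTP 503" could mean on the client side, written in Python for brevity. This is not the desktop client's actual logic: `Action` and `plan_for_status` are hypothetical names, and the rule that only a 404 permits local removal is an assumption drawn from this discussion.

```python
from enum import Enum


class Action(Enum):
    SYNC = "sync"                  # listing succeeded, compare as usual
    REMOVE_LOCAL = "remove_local"  # server confirmed the path is gone
    SKIP = "skip"                  # leave local data alone, retry later


def plan_for_status(status: int) -> Action:
    """Decide what to do with a local folder given the PROPFIND status."""
    if status == 207:
        return Action.SYNC
    if status == 404:
        return Action.REMOVE_LOCAL
    # 503, 401, timeouts and anything else say nothing about whether
    # the data still exists, so local files are left untouched.
    return Action.SKIP


for code in (207, 404, 503):
    print(code, plan_for_status(code).value)
```

Under this rule, a server that answers 503 for an offline mount never triggers local deletion, while one that answers 404 still does.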

MTRichards commented 10 years ago

Ok @rperezb As an enhancement, we should seek to use this:

Yes, storages can throw a \OCP\Files\StorageNotAvailableException which will be handled correctly by the webinterface and sync client

So that these files are not deleted just because the server goes down. In fact, in at least one use case (server goes down) the desktop is the only source for the files. If they get deleted, then this would be bad. So Yes, this needs to be put into that repo as an enhancement for SP and win network drives.

MTRichards commented 10 years ago

Also, to add, we need to consider a time based expiration. If it is offline for, say, 7 days it gets removed by the sync client. But we don't want the sync client removing things just if there is a network error that is highly temporary. Since we can't tell the difference, we should have a time window.

This should also be reflected in all of our external storage connectors...

Green9332 commented 10 years ago

@MTRichards - please do not make assumptions about the use cases of the world. A million things can happen. People can take a drive on vacation as a backup which also happens to be an external storage for their OC (typical of individual users). A NAS can have a serious RAID crash requiring weeks of repairs, or waiting on suppliers to courier this or that HDD, etc.

From the OC perspective things are very simple: if a storage is configured but not accessible, neither OC nor the OC client deletes anything, ever. If a human comes to disable or remove a storage, and a human therefore gave their blessing and approval, OC can start deleting. Simple as that. Why on earth should OC delete anything after any amount of time? What is the motivation? Is it OC's mandate to decide the fate of users' data automatically? Why 7 days and not 8 or 3? 7 days may work for you, whilst for someone else only 7 minutes will suffice, for another 7 months, and for yet another 7 years. If you want to put a limit on it, then make it something the user can set per storage. But pretty please do not ever hardcode it.

Rather think of the use cases you can cater for if you made it customizable. Think for example of the pharmaceutical industry which has to keep records intact for 15 years, yet, must be able to reproduce the IT environment for the FDA in a heartbeat. How great would it be if OC allowed one to remove the storage device once a study/project was over, go put it into a safe, yet, be able to plug it back in after 5 years, just to have everyone's shared/access appear on the shares as if by magic exactly as it was the day it was removed. Now granted, in this specific case you probably would want the clients to delete the local files, yet you would not want OC to forget about (delete) the storage or configurations thereof.

If the user removes something either by mistake or because it is in for maintenance and forgot to remove his config, then that is their problem. Only two problems in the world, your problem and my problem. Having a configured storage go offline for a minute or a day or a month is OC's problem to gracefully handle. Forgetting to remove a config of a storage is the user's problem and not OC's responsibility to make any assumptions or decisions about. That is what ITIL and IT SOPs are for.

What OC really needs is a proper UI checkbox to "temporarily" disable external storages, enabling OC to gracefully handle maintenance periods, etc. The fallback is to handle it gracefully in any event, should the user forget to use the checkbox.

Let's look at the current alternatives to a simple checkbox: I can delete my stores "temporarily", but then I could lose all my shared settings on said storages, and most likely the clients will start to delete stuff. I could leave OC running, but at the moment it will in fact start to delete content. I could stop OC altogether so the clients can't connect at all, but this can turn into a nightmare in High Availability and corporate environments where things must run 24/7; plus, I now not only need to consider the external storage device requiring maintenance, but actually need a proper plan to shut things down in sequence so I can do a few seconds of maintenance on a single NAS. See where OC currently leaves the IT Department? Between a rock and a hard place. It doesn't matter what they do, it becomes a very risky and complex operation. This could have been easily solved if a) each storage simply had a checkbox putting it either online or offline, or b) OC simply did nothing, ever, if it can't connect to either a local or an external storage. Or if OC simply and automatically prevented client connections altogether during such maintenance periods. Or even better, if OC simply and automatically displayed "Temporarily Offline for Maintenance". Now that would not only be nifty, but would solve a lot of issues proactively and leave the IT people with nothing to worry about or act on.

My 2.5c worth!
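The two proposals in this exchange (a time window, and a per-storage setting with "never" as the default) can be combined in a small policy object. The following Python sketch is only an illustration: `StoragePolicy`, its fields, and `may_remove_local_copies` are hypothetical names and do not correspond to any existing ownCloud option.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class StoragePolicy:
    """Hypothetical per-storage settings an admin could edit."""
    maintenance: bool = False                  # the "temporarily offline" checkbox
    offline_grace: Optional[timedelta] = None  # None means never remove automatically


def may_remove_local_copies(policy: StoragePolicy,
                            offline_since: datetime,
                            now: datetime) -> bool:
    """Return True only if the admin explicitly allowed removal after a delay."""
    if policy.maintenance or policy.offline_grace is None:
        return False
    return now - offline_since >= policy.offline_grace


offline_since = datetime(2014, 9, 17, 19, 17)
now = offline_since + timedelta(days=8)

print(may_remove_local_copies(StoragePolicy(), offline_since, now))
print(may_remove_local_copies(
    StoragePolicy(offline_grace=timedelta(days=7)), offline_since, now))
```

With the defaults nothing is ever removed; a 7-day window has to be opted into per storage, and the maintenance flag overrides it.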

MTRichards commented 10 years ago

Hi @Green9332 - Great to hear from you!

The point is really just this - we can't delete files because ownCloud can't see external storage for some reason. We need to think of a mechanism where someone (admin) can decide what to do with it when it is offline. At the same time, it can't be so complex that no one can figure out how to use it. As we add more and more connectors, this will become more and more important.

Thanks for taking the time to write this up.

Green9332 commented 10 years ago

@MTRichards pleasure, anything to help with the vision. I can live with that: easy to understand, easy to implement, robust and not malignant. And it is better to consider it carefully now and do it properly than to redev every few months.

PVince81 commented 10 years ago

Note: one prerequisite is that all storage backends properly return 503 when unavailable; this was raised here: https://github.com/owncloud/core/issues/11792

RobinMcCorkell commented 9 years ago

Closing in favour of #11785

jonyadamit commented 9 years ago

I have a network share mounted as a local folder, and configured as a local storage in OC. When that network share is not available, the OC client deletes the entire folder (more than 50GB worth of data) and then downloads it again once it's available. Does this justify reopening this bug? Or, since it's a local mount, is there no way to tell whether it was deleted or is temporarily unavailable? I had too many problems with WebDAV and would like to keep it as a local storage.

PVince81 commented 8 years ago

Reopening. I think this is not a duplicate of https://github.com/owncloud/core/issues/11785, as we need a more generic mechanism for any kind of storage.

PVince81 commented 8 years ago

The trouble with some mounts is that they might be on a local folder and the local folder appears suddenly empty when the mount is gone. One idea here is to have a mechanism similar to the sync client: whenever a storage that had files is suddenly empty, treat it as a potential error and throw StorageNotAvailableException.

PVince81 commented 8 years ago

The latter only applies when using the "Local" external storage pointing at a folder which itself is a Linux FS-level mount. If that mount is gone, the folder would appear empty to OC.
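The "suddenly empty" idea can be sketched as follows, in Python for illustration only. `StorageNotAvailableError` and `list_storage` are hypothetical names, and `cached_count` stands for however many entries the file cache last recorded for the storage root; none of this is existing ownCloud code.

```python
class StorageNotAvailableError(Exception):
    """Hypothetical stand-in for ownCloud's StorageNotAvailableException."""


def list_storage(list_dir, cached_count):
    """List a storage root, refusing to trust a suddenly empty result.

    `list_dir` is a callable returning the current entries;
    `cached_count` is the number of entries seen on the previous scan.
    """
    entries = list(list_dir())
    if cached_count > 0 and not entries:
        # Had files before, has none now: more likely a vanished mount
        # than a real mass deletion, so report "unavailable" instead.
        raise StorageNotAvailableError("storage root is unexpectedly empty")
    return entries


print(list_storage(lambda: ["report.odt"], cached_count=3))

try:
    list_storage(lambda: [], cached_count=3)
except StorageNotAvailableError as exc:
    print("treated as offline:", exc)
```

The trade-off is that a user who really empties the folder outside ownCloud would be reported as offline until someone confirms the change, so a check like this would probably need an override.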

ownclouders commented 6 years ago

Hey, this issue has been closed because the label status/STALE is set and there were no updates for 7 days. Feel free to reopen this issue if you deem it appropriate.

(This is an automated comment from GitMate.io.)

PVince81 commented 6 years ago

closing in favor of https://github.com/owncloud/core/issues/32552 which has more research